The Australian Plant Phenomics Facility is working to meet the highest standards for research data management.
Crop phenomics is about characterising plant behaviour in a particular environment and fundamentally involves measuring variables, which means generating and managing data is central to what we do.
Data architect Rakesh David and APPF Data Management Director Donald Hobern are part of the APPF Central team. They are working to develop guidelines for researchers, so the data generated by our various projects is universally Findable, Accessible, Interoperable and Reusable, or ‘FAIR’.
These FAIR principles are an international benchmark for data management.
“APPF is developing a national data strategy and architecture for end-to-end FAIR data management across our network,” Dr David says.
“The goal is to convert APPF’s research outputs into more useful national and international digital assets.”
Dr David’s own background is in plant science, and he approaches APPF’s data architecture from a researcher’s perspective rather than as a software challenge.
“Traditionally, research data has tended to be developed as a single use case and stored in distinct data silos,” he says.
“However, it is now much more common for research organisations, phenomics facilities, funding bodies and industry to collaborate – including across borders and over time.
“Researchers need to be able to locate and identify relevant datasets so they can compare or integrate them with their own projects.”
He says metadata is the key to building a standardised index that makes it easy to identify the provenance, processes and content of research datasets. This can be enhanced by a data hierarchy that lets researchers know if the asset is (for example) raw data from sensors or cameras, derived data such as measurements of particular traits, or a record of multiple traits and results that answer a specific research question.
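As a rough illustration of the idea, the sketch below shows how a metadata record might carry provenance, process and a hierarchy level, and how a simple index could then be searched. It is a minimal sketch only: the class names, field names and identifiers are hypothetical and are not APPF's actual schema or architecture.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DataLevel(Enum):
    """Hypothetical three-tier hierarchy mirroring the levels described above."""
    RAW = "raw"          # e.g. output straight from sensors or cameras
    DERIVED = "derived"  # e.g. measurements of particular traits
    RESULT = "result"    # multiple traits and results answering a research question


@dataclass
class DatasetRecord:
    """Illustrative metadata record; every field name here is an assumption."""
    identifier: str
    title: str
    level: DataLevel
    source: str    # provenance: where the data came from
    process: str   # how the data was produced
    derived_from: List[str] = field(default_factory=list)


def find_by_level(index: List[DatasetRecord], level: DataLevel) -> List[DatasetRecord]:
    """Return every record in the index at the requested hierarchy level."""
    return [record for record in index if record.level is level]


def trace_provenance(index: List[DatasetRecord], identifier: str) -> List[str]:
    """Walk the derived_from links back from a dataset to its upstream sources."""
    by_id = {record.identifier: record for record in index}
    chain: List[str] = []
    seen = set()
    stack = [identifier]
    while stack:
        current = stack.pop()
        if current in seen or current not in by_id:
            continue  # skip repeats and identifiers missing from the index
        seen.add(current)
        chain.append(current)
        stack.extend(by_id[current].derived_from)
    return chain


# Made-up records for illustration only.
example_index = [
    DatasetRecord("example-raw-001", "Camera images", DataLevel.RAW,
                  source="imaging platform", process="automated capture"),
    DatasetRecord("example-derived-001", "Trait measurements", DataLevel.DERIVED,
                  source="image analysis", process="trait extraction",
                  derived_from=["example-raw-001"]),
    DatasetRecord("example-result-001", "Trial results", DataLevel.RESULT,
                  source="research project", process="statistical analysis",
                  derived_from=["example-derived-001"]),
]

if __name__ == "__main__":
    for record in find_by_level(example_index, DataLevel.RAW):
        print(record.identifier, "-", record.title)
    print(trace_provenance(example_index, "example-result-001"))
```

In this sketch, a researcher who finds the results dataset can follow its links back through the derived measurements to the raw images that produced them.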
However, he is also mindful of variations between different research projects and environments.
“We have to appreciate that each APPF node has its own areas of specialisation and the host organisations have their own standards and protocols for data management,” he says.
“So one of our key tasks is helping research teams achieve universal FAIR standards without compromising their other requirements.”
As an NCRIS facility, APPF is part of the national research ecosystem, which will be strengthened by having standardised policies for data storage, sharing and collaboration.
APPF has also joined the Australian Research Data Commons (ARDC) Data Retention Program, which aims to maximise the impact of Australia’s nationally significant data collections by establishing a national digital research infrastructure.
Through this partnership, all of APPF’s important datasets can be enriched using best-practice metadata to ensure they are easily findable and accessible for future researchers.
For researchers, optimising their data architecture from the start of a project is the easiest approach – and the benefits can be far-reaching.
As research and funding organisations become attuned to the FAIR data principles – and the abiding value of integrating research data – building universal data management processes into research projects can help make funding applications more attractive and competitive.
Applying standardised data management processes can also make it much easier to collaborate with researchers at other nodes within the APPF and at phenomics facilities around the world. Bringing universality to the datasets developed by different research teams will streamline how data can be shared, compared, combined, integrated and manipulated to produce much larger and more meaningful results.
“Researchers should begin by engaging with the APPF Central data team to discuss how their research data will be managed throughout its lifecycle,” Dr David says.
“It is important to first identify all data streams for a research project, be it in a controlled environment facility, a field, or via mobile phenotyping platforms such as drones or UAVs,” he says.
“Then we can develop a strategy to capture, store and manage the data from these different sources.”
APPF plays an important role in helping the agricultural research community develop new and improved crops. Making it easier to aggregate and transform our datasets will benefit researchers, growers and, ultimately, consumers.