The Phenomics Ontology Driven Data repository (PODD) project is a two-year data management software development project funded by the National eResearch Architecture Taskforce (NeAT). The PODD system is a web-based data repository developed to meet the data management requirements of the NCRIS 5.2-funded phenomics facilities: the Australian Plant Phenomics Facility (APPF) and the Australian Phenomics Network (APN). These requirements are to capture, manage, secure, distribute and publish raw and analysed data from the phenotyping platforms run by these facilities, and to capture sufficient contextual information (metadata) to support data discovery and analysis services. In turn, PODD will provide metadata to the Atlas of Living Australia (ALA), so that the information generated by the APN and the APPF can be represented in the ALA as scientific reference collections.
The PODD system will be an online repository that supports data lodgement and retrieval, either manually via a web-based user interface or via internet-based automated processes. PODD uses Fedora Commons repository software to manage, preserve and link data and metadata, which will primarily reside on ARCS-provided data storage.
The internal structure of the data and metadata is modelled using OWL and RDFS, languages used for expressing ontologies. An ontology is a common vocabulary about a particular topic, in this case phenomics, and the PODD structure therefore becomes an ontology for phenomics. This approach allows PODD to model multiple phenomics processes without enforcing a rigid data structure. In addition, other community ontologies, such as plant, mammal and gene ontologies, can be used to annotate the data. This not only supports data discovery, but also provides common reference points through which PODD-managed data can be integrated with data in other research databases, both domestic and international.
PODD organises the data into projects. Data security and data publication are managed at the project level, so data is not available for public access until the relevant project is complete and its data is ready to be published. PODD also uses Persistent Identifiers to identify published data and to provide persistent web links to the data, which can in turn be cited in journal articles. Two separate instances of PODD will operate, one for the APN and one for the APPF. Each instance will manage data relevant to its research domain (plant or mouse models), with no overlap or linkage between the two. Both instances of PODD will also provide metadata for publicly accessible data directly to the Atlas of Living Australia in the required ALA format.
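As a rough illustration of the two ideas above (annotating data with ontology terms, and withholding data from public access until its project is published), the following minimal Python sketch may help. It is not PODD's implementation: the `Project` class, the `find_by_term` function, the `example:` prefixed terms and the identifiers are all hypothetical, and plain tuples stand in for the OWL/RDFS statements that a real deployment would hold in its repository.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Hypothetical stand-in for an RDF statement: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass
class Project:
    """Illustrative project record; not PODD's actual data model."""
    identifier: str
    members: Set[str]
    published: bool = False
    triples: List[Triple] = field(default_factory=list)

    def annotate(self, subject: str, predicate: str, obj: str) -> None:
        # Attach an ontology term (the object) to a data item (the subject).
        self.triples.append((subject, predicate, obj))

    def can_read(self, user: Optional[str]) -> bool:
        # Published projects are public; otherwise only project members may read.
        return self.published or (user is not None and user in self.members)


def find_by_term(projects: List[Project], term: str,
                 user: Optional[str] = None) -> List[str]:
    """Return identifiers of readable projects annotated with the given term."""
    return [
        p.identifier
        for p in projects
        if p.can_read(user) and any(obj == term for _, _, obj in p.triples)
    ]


# Example usage with made-up identifiers and terms.
ongoing = Project("project-1", members={"alice"})
ongoing.annotate("project-1/plant-42", "example:hasTrait", "example:drought-tolerance")

finished = Project("project-2", members={"bob"}, published=True)
finished.annotate("project-2/plant-7", "example:hasTrait", "example:drought-tolerance")

public_hits = find_by_term([ongoing, finished], "example:drought-tolerance")
member_hits = find_by_term([ongoing, finished], "example:drought-tolerance", user="alice")
```

In this sketch an anonymous search only reaches the published project, while a member of the unpublished project also sees their own data. Because the annotation is a shared term rather than a free-text label, the same lookup works across projects that use the same community vocabulary.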
For further information about PODD projects and services, please contact Dr Warren Creemers.