Notes on PGDSO revisions fall 2011
Plan for revising the PGDSO, fall 2011
The current PGDSO is a mixture of processes and phases (although I think it was intended to be all phases), and was designed with angiosperms in mind.
We need to correct some specific problems with the PGDSO and make it applicable to all plants, as we have done with the PAO.
Problems with current PGDSO
1. Names:
- many are process names, rather than phase names (e.g., senescence, rather than senescence phase)
- numbers in names (e.g., BO.02 mid boot stage)
- plural names (e.g., root development stages)
2. Definitions:
- most (maybe all) have textual definitions, but these are not in genus-differentia form
- many definitions, as with names, describe a process rather than a phase (e.g., 6 ripening: Maturation of the fruit.)
- top level terms need better definitions
3. Ontology structure
- limited upper level hierarchy that is angiosperm-centric
- existing leaf terms mostly for angiosperms
- is_a and part_of relations have not been used consistently for subtypes
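The definition problem in item 2 could be screened for mechanically. Below is a minimal Python sketch, assuming that a genus-differentia definition opens with "A <parent name>" followed by a connective such as "that" or "during which". The function name, the list of connectives, and the parent term and rewritten definition in the usage lines are hypothetical illustrations, not existing PGDSO content.

```python
import re

# Connectives assumed to introduce the differentia; adjust to the agreed policy.
CONNECTIVES = ("that", "which", "during which", "in which")

def is_genus_differentia(definition, parent_name):
    """Return True if the definition opens with 'A/An <parent_name> <connective>'."""
    connective = "|".join(re.escape(c) for c in CONNECTIVES)
    pattern = rf"^(?:A|An)\s+{re.escape(parent_name)}\s+(?:{connective})\b"
    return re.match(pattern, definition.strip(), flags=re.IGNORECASE) is not None

# Current definition quoted above for "6 ripening" (it describes a process).
print(is_genus_differentia("Maturation of the fruit.", "fruit development stage"))

# Hypothetical rewrite that names the parent first, then the differentia.
print(is_genus_differentia(
    "A fruit development stage during which the fruit matures.",
    "fruit development stage",
))
```

A check like this only flags candidates for review; a curator would still need to judge whether the differentia is accurate and general enough.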
Goals
1. Compliance with OBO Foundry guidelines and best practices
- provide is_a parents for all terms (only a few terms still lack one)
- provide accurate names for terms
- provide accurate text definitions for all terms in genus-differentia form
2. Alignment with BFO
- define root node in relation to a BFO or other upper-level ontology term (is this possible? need to investigate)
- all subtypes in PGDSO should be occurrents (already are, but of the wrong type)
- all subtypes in PGDSO should be growth and development stages, and have names and definitions that reflect that
3. Provide growth and development stages that allow for annotation of all plant species
- enhance top level of hierarchy, as was done for PAO (more terms, more general definitions)
- populate mid levels with general growth and development phases/stages that can be used for any plant
- use the two items above (the enhanced top level and the general mid-level phases/stages) to provide a framework for taxon-specific leaf terms; incorporate existing low-level terms into that framework
- develop a set of landmarks that can be used to describe phase transitions, making phases comparable across taxa - some of these may need to be in GO
4. Develop a strategy for linking PAO and PGDSO terms
- determine the most appropriate relations and how to implement them (may need to create our own compound relations)
5. Develop PGDSO terms needed to describe plant phenotypes/traits
Strategy
In my opinion, the PGDSO should restrict its domain to growth and development stages, plus any landmarks needed to describe transitions between phases (some of which are in GO already, e.g., GO:0007126 meiosis, GO:0009845 seed germination). Growth and development processes (and other biological processes in plants) are already in the domain of GO, and should be left there. There is no reason we can’t suggest new terms to GO if we feel they are missing. Although some PGDSO terms seem to reflect processes (e.g., imbibition, senescence, anthesis), I think these terms were intended to describe the phase during which the process occurs, not the process itself. Furthermore, phase terms are what the genetics and genomics communities need to describe the temporal aspect of tissue sampling. Given our limited time and resources, I suggest that we stay focused on our existing domain, and not try to expand it too much at this point.
Goals 1 and 2 could be accomplished fairly easily, without the addition of any new terms, once we agree on policies for creating names and definitions. Most of this can be based on our experience with the PAO. Goals 3 and 4 will require the most work and discussion. These should be the focus of our NYBG meeting, particularly goal 3. Goal 4 will probably require input from CM, and we will need additional meetings with him. Goal 5 is mostly important for future work, and does not require much immediate attention, but we should keep it in mind as we work through the other goals.
Specific tasks
We should try to work on tasks 1-5 at the POC meeting at the NYBG, even if we can't finish them all. Tasks 6 and 7 can be handled over conference calls.
1. Clarify the domain of the PGDSO and define the root-level term accordingly
- if possible, refer to BFO or other upper-level ontology
2. Develop and implement a sound naming policy
- choose between “stage” and “phase”
- decide whether or not to eliminate letters and numbers from term names
- most naming problems can be remedied pretty easily (add stage or phase to the end of the name)
3. Develop the top levels of the PGDSO
- determine what categories we will need to provide ancestors for all existing lower-level terms
- determine what new categories we want to include (e.g., life cycle boundary/landmark)
- determine tree structure for top-level terms
- write definitions of top-level terms that are general enough to encompass all plants
4. Work existing upper- to mid-level terms into the hierarchy determined in step 3.
- write genus-differentia definitions for these terms, and check for generality
5. Populate any new top-level categories, such as life cycle boundary/transition
- we have notes/suggestions for this from our last meeting
- first focus on major transitions and those that can be clearly defined; more ambiguous sub-divisions can be added as time allows
6. Add new mid-level terms needed to describe growth or development stages for non-angiosperm plants
7. Determine which relations are needed to link the PAO and PGDSO
- participates_in and variants
- has_participant and variants
- others
Suggestions/notes for specific tasks
Root term(s)
plant growth and development stage/phase
current definition: The succession of changes leading from the zygote to the mature plant.
This definition describes a growth and development process, not a phase, so it doesn’t match the name. It also covers only the sporophytic phase.
Proposed definition:
Do we need a higher-level term that includes both growth and development stages and transitions or other processes?
Naming policies
Stage versus phase
- Barry prefers phase (I don’t remember why)
- Stage is already in the name. If we change to phase, the ontology would become the PGDPO, which might lead to confusion, but I don’t know how many people outside the PO team use the name PGDSO
Numbers and letters in name
- make browsing easier, because terms are in temporal rather than alphabetic order
- names without letters and numbers can be added as exact synonyms
- names with letters and numbers are confusing when terms are viewed in isolation
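Whichever way this is decided, both forms of each name can be generated mechanically. Below is a minimal Python sketch, assuming the ordering codes look like the examples in these notes ("4", "6", "B", "BO.02") and that names are normalized to a singular "stage" or "phase" suffix. The function names and the regular expressions are hypothetical and would need checking against the full term list.

```python
import re

# Assumed shapes of the ordering codes: "4", "6", "B", "BO.02".
CODE_PREFIX = re.compile(r"^((?:[A-Z]{1,2}\.?\d+(?:\.\d+)*|\d+(?:\.\d+)*|[A-Z]))\s+")
# Existing singular or plural suffix to be replaced by the agreed one.
SUFFIX = re.compile(r"\s+(?:stages?|phases?)$")

def split_code(name):
    """Split 'BO.02 mid boot stage' into ('BO.02', 'mid boot stage')."""
    match = CODE_PREFIX.match(name)
    if match is None:
        return None, name
    return match.group(1), name[match.end():]

def propose_names(old_name, suffix="stage"):
    """Return (code-free name ending in suffix, same name with the code kept)."""
    code, plain = split_code(old_name.strip())
    plain = f"{SUFFIX.sub('', plain)} {suffix}"
    coded = f"{code} {plain}" if code else plain
    return plain, coded

for old in ["4 flowering", "BO.02 mid boot stage", "root development stages", "senescence"]:
    print(old, "->", propose_names(old, suffix="stage"))
```

Whichever form becomes the primary name, the other one returned by `propose_names` could be recorded as an exact synonym, so neither browsing order nor readability in isolation is lost.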
Whole plant growth stage names
Current names for some of these are confusing, even though their definitions and comments clarify the meaning. For example: PO:0007016 4 flowering. If we rename this “flowering phase”, it will not be clear from the name alone that it represents a whole plant growth phase.
Suggest renaming it something like “whole plant flowering phase”
PO:0007130 B reproductive growth would be “whole plant reproductive phase”, etc.
Additional naming suggestions are on the wiki.
Suggested top-level hierarchy
Open SourceForge trackers for the PGDSO
See Items_for_future_meetings#User_requests_still_open_on_Source_Forge.3B_PGDSO for a complete list.
PGDSO terms without is_a parents
imbibition (PO:0007022)
seedling growth (PO:0007131)
shoot emergence (PO:0007130)
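A list like the one above can be regenerated from the ontology file whenever it changes. Below is a minimal Python sketch, assuming the file is in OBO flat-file format with `[Term]` stanzas and `is_a:` tags. The parser handles only the tags needed here, and the sample text is a hypothetical fragment (the `EX:` identifiers and the `part_of` link are invented for illustration), not the contents of the real file.

```python
def parse_terms(obo_text):
    """Collect id, name and is_a parents for each [Term] stanza."""
    terms = []
    current = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are kept.
            current = {"id": None, "name": None, "is_a": []} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop trailing "! comment"
            if tag == "is_a":
                current["is_a"].append(value)
            elif tag in ("id", "name"):
                current[tag] = value
    return terms

def terms_without_is_a(terms, roots=()):
    """Return (id, name) for terms lacking an is_a parent, skipping known roots."""
    return [(t["id"], t["name"]) for t in terms if not t["is_a"] and t["id"] not in roots]

# Hypothetical fragment: EX: identifiers and the part_of link are made up.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example root stage

[Term]
id: EX:0000002
name: example child stage
is_a: EX:0000001 ! example root stage

[Term]
id: PO:0007022
name: imbibition
relationship: part_of EX:0000002 ! example child stage

[Typedef]
id: part_of
name: part_of
"""

print(terms_without_is_a(parse_terms(SAMPLE), roots={"EX:0000001"}))
```

Note that a term with only a part_of relationship is still reported, which matches the goal of giving every term an is_a parent.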
'plant life' ('life of whole plant')
Barry suggested that we need a term like "life of whole plant", of which all phases would form parts.
Previous proposal
See POC_Conf._Call_12-15-10#Restructuring_PGDSO_for_Occurents for a description of a proposal that BS, LC and RW worked on in December.
Also see: http://palea.cgrb.oregonstate.edu/viewsvn/Poc/branches/PGDSO_2011/plant_ontology_test.obo?view=log for a copy of the test ontology file, and "changes in new PGDSO version" for a summary of changes that were made to the test file.
In that proposal, the root term, plant growth and development stage, would be replaced by 'plant life cycle process' (an "Occurrent"). The top level structure would have three branches, as below:
The proposal described here is simpler than December's (it does not include biological processes, which could be in GO), but follows many of the same principles.
New relations linking the PAO and the PGDSO
derives_from
At the POC meeting on 7-19-11, we agreed to create a new relation in the PO called derives_by_manipulation_from (see POC_Conf._Call_7-19-11#derives_from_relations_in_PO:). This would be a special case of the RO relation derives_from.
participates_in and has_participant
See POC_Conf._Call_7-19-11#participates_in_relation
Chris asked: "will you add temporal relations to the stages?"
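Because participates_in and has_participant are inverses of each other, only one direction of each PAO-to-PGDSO link would need to be curated. Below is a minimal Python sketch of deriving the other direction, assuming links are stored as (subject, relation, object) triples. The function name and the example structure and stage are hypothetical; they are not existing PAO or PGDSO links.

```python
# Inverse pairs assumed from the two relations named above.
INVERSE = {
    "participates_in": "has_participant",
    "has_participant": "participates_in",
}

def add_inverses(links):
    """Return the links plus the inverse of every link whose relation has one."""
    result = set(links)
    for subject, relation, obj in links:
        if relation in INVERSE:
            result.add((obj, INVERSE[relation], subject))
    return result

# Hypothetical link between a PAO structure and a PGDSO stage.
links = {("fruit", "participates_in", "fruit ripening stage")}
for triple in sorted(add_inverses(links)):
    print(triple)
```

Variants of these relations, or compound relations created for the PO, could be handled the same way by adding their pairs to `INVERSE`.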