Notes on PGDSO revisions fall 2011


Revision as of 18:14, 2 September 2011

Previous proposal

See POC_Conf._Call_12-15-10#Restructuring_PGDSO_for_Occurents for a description of a proposal that BS, LC and RW worked on in December.

Also see http://palea.cgrb.oregonstate.edu/viewsvn/Poc/branches/PGDSO_2011/plant_ontology_test.obo?view=log for a copy of the test ontology file, and "changes in new PGDSO version" for a summary of the changes that were made to the test file.

In that proposal, the root term, plant growth and development stage, would be replaced by 'plant life cycle process' (an "Occurrent"). The top-level structure would have three branches, as shown below:

[Figure: Life cycle process 2.jpg]


The proposal described here is simpler than December's (it does not include biological processes, which could be in GO), but follows many of the same principles.

Plan for revising the PGDSO, fall 2011

Problems with current PGDSO:

1. Names:
a. many are process names, rather than phase names (e.g., senescence, rather than senescence phase)
b. numbers in names (e.g., BO.02 mid boot stage)
c. plural names (e.g., root development stages)

2. Definitions:
a. most (maybe all) have textual definitions, but these are not in genus-differentia form
b. many definitions, as with names, describe a process rather than a phase (e.g., 6 ripening: Maturation of the fruit.)

3. Ontology structure:
a. limited upper-level hierarchy that is angiosperm-centric
b. existing leaf terms are mostly for angiosperms
c. is_a and part_of relations have not been used consistently for subtypes
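The naming problems listed above (process names with no stage or phase suffix, number or letter codes, plural names) could be flagged automatically before we review names by hand. The following is only a rough sketch: the function name and the regular expressions are assumptions based on the examples given here, not an agreed PGDSO policy.

```python
import re

# Assumed shapes of the codes seen in current names, e.g. "6", "B", "BO.02".
CODE_PREFIX = re.compile(r"^(?:[A-Z]{1,2}\.\d+|\d+|[A-Z])\s+")
PHASE_SUFFIX = re.compile(r"\b(?:stage|phase)$")
PLURAL_SUFFIX = re.compile(r"\b(?:stages|phases)$")


def name_problems(name):
    """Return a list of naming problems found in a term name (hypothetical helper)."""
    problems = []
    if CODE_PREFIX.match(name):
        problems.append("code prefix")
    if PLURAL_SUFFIX.search(name):
        problems.append("plural")
    elif not PHASE_SUFFIX.search(name):
        problems.append("no stage/phase suffix")
    return problems


for example in ("senescence", "BO.02 mid boot stage", "root development stages"):
    print(example, "->", name_problems(example))
```

A check like this only catches names that look wrong; it cannot tell whether a definition describes a process or a phase, so problem 2 would still need manual review.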

Goals:

1. Compliance with OBO Foundry guidelines and best practices
a. provide is_a parents for all terms (only a few terms are still missing one)
b. provide accurate names for terms
c. provide accurate text definitions for all terms in genus-differentia form

2. Alignment with BFO
a. define root node in relation to a BFO or other upper-level ontology term (is this possible? need to investigate)
b. all subtypes in PGDSO should be occurrents (already are, but of the wrong type)
c. all subtypes in PGDSO should be growth and development stages, and have names and definitions that reflect that

3. Provide growth and development stages that allow for annotation of all plant species
a. enhance top level of hierarchy, as was done for PAO (more terms, more general definitions)
b. populate mid levels with general growth and development phases/stages that can be used for any plant
c. use a and b to provide a framework for taxon-specific leaf terms; incorporate existing low-level terms into that framework
d. develop a set of landmarks that can be used to describe phase transitions, making phases comparable across taxa

4. Develop a strategy for linking PAO and PGDSO terms
a. determine the most appropriate relations and how to implement them (may need to create our own compound relations)

5. Develop PGDSO terms needed to describe plant phenotypes/traits

Strategy

In my opinion, the PGDSO should restrict its domain to growth and development stages. Growth and development processes (and other biological processes in plants) are already in the domain of GO, and should be left there. There is no reason we can’t suggest new terms to GO, if we feel they are missing. Although some PGDSO terms seem to reflect processes (e.g., imbibition, senescence, anthesis), I think these terms were intended to describe the phase during which the process occurs, not the process itself. Terms for phases are what are needed by the genetics and genomics communities to describe the temporal aspect of tissue sampling. Given our limited time and resources, I suggest that we stay focused on our existing domain, and not try to expand it too much at this point.

Goals 1 and 2 could be accomplished fairly easily, without the addition of any new terms, once we agree on policies for creating names and definitions. Most of this can be based on our experience with the PAO. Goals 3 and 4 will require the most work and discussion. These should be the focus of our NYBG meeting, particularly goal 3. Goal 4 will probably require input from CM, and we will need additional meetings with him. Goal 5 is mostly important for future work, and does not require much immediate attention, but we should keep it in mind as we work through the other goals.

Specific tasks:

1. Clarify the domain of the PGDSO and define the root-level term accordingly
a. if possible, refer to BFO or other upper-level ontology

2. Develop and implement a sound naming policy
a. choose between “stage” and “phase”
b. decide whether or not to eliminate letters and numbers from term names
c. most naming problems can be remedied pretty easily (add stage or phase to the end of the name)

3. Develop the top levels of the PGDSO
a. determine what categories we will need to provide ancestors for all existing lower-level terms
b. determine any new categories we want to include (e.g., life cycle boundary/landmark)
c. determine tree structure for top-level terms
d. write definitions of top-level terms that are general enough to encompass all plants

4. Work existing upper- to mid-level terms into the hierarchy determined in step 3
a. write genus-differentia definitions for these terms, and check for generality

5. Populate any new top-level categories, such as life cycle boundary/transition
a. we have notes/suggestions for this from our last meeting
b. first focus on major transitions and those that can be clearly defined; more ambiguous sub-divisions can be added as time allows

6. Add new mid-level terms needed to describe growth or development stages for non-angiosperm plants

7. Determine which relations are needed to link the PAO and PGDSO
a. participates_in and variants
b. has_participant and variants


Suggestions/notes for specific tasks:

Root term
1. plant growth and development stage/phase
2. Current definition: The succession of changes leading from the zygote to the mature plant.
a. This definition describes a growth and development process, not a phase, so it doesn’t match the name
3. Proposed definition:

Naming policies
1. Stage versus phase
a. Barry prefers phase (I don’t remember why)
b. Stage is already in the name. If we change to phase, it would become the PGDPO, which might lead to confusion, but I don’t know how many people outside the PO internal group use the name PGDSO
2. Numbers and letters in names
a. they make browsing easier, because terms are listed in temporal rather than alphabetical order
b. names without letters and numbers can be added as exact synonyms
c. names with letters and numbers are confusing when terms are viewed in isolation
3. Whole plant growth stage names
a. Current names for some of these are confusing, even though the definitions and comments in them clarify. For example: PO:0007016 4 flowering. If we rename this “flowering phase”, it will not be clear from the name alone that it represents a whole plant growth phase.
b. Suggest renaming it something like “whole plant flowering phase”
c. PO:0007130 B reproductive growth would be “whole plant reproductive phase”, etc.
4. additional naming suggestions are on the wiki
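To make the naming options above concrete, here is a rough sketch of how an uncoded name could be derived from a current name and then recorded either as the new term name or as an exact synonym. The helper names, the default suffix and the prefix pattern are all assumptions made for illustration; the choice between "stage" and "phase" is still open.

```python
import re

# Assumed shape of the codes seen in current names ("4", "B", "BO.02").
CODE_PREFIX = re.compile(r"^(?:[A-Z]{1,2}\.\d+|\d+|[A-Z])\s+")


def uncoded_name(name, suffix="phase", whole_plant=False):
    """Strip a leading code and make sure the name ends in stage/phase."""
    base = CODE_PREFIX.sub("", name)
    if not re.search(r"\b(?:stage|phase)$", base):
        base = f"{base} {suffix}"
    if whole_plant and not base.startswith("whole plant "):
        base = f"whole plant {base}"
    return base


def term_stanza(term_id, name, exact_synonyms=()):
    """Format a minimal OBO [Term] stanza with optional exact synonyms."""
    lines = ["[Term]", f"id: {term_id}", f"name: {name}"]
    lines += [f'synonym: "{s}" EXACT []' for s in exact_synonyms]
    return "\n".join(lines)


old = "4 flowering"
new = uncoded_name(old, whole_plant=True)
# Option A: uncoded name becomes the term name, coded name kept as a synonym.
print(term_stanza("PO:0007016", new, [old]))
# Option B: coded name stays, uncoded name added as an exact synonym.
print(term_stanza("PO:0007016", old, [new]))
```

A mechanical rewrite like this will not always give the preferred wording (it would not shorten "B reproductive growth" to "whole plant reproductive phase", for example), so its output should be treated as a starting list for manual editing.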


Suggested top-level terms:

Open SourceForge trackers for the PGDSO

See Items_for_future_meetings#User_requests_still_open_on_Source_Forge.3B_PGDSO for a complete list.

PGDSO terms without is_a parents

imbibition (PO:0007022)

seedling growth (PO:0007131)

shoot emergence (PO:0007130)
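A list like the one above could be regenerated from the OBO file whenever the hierarchy changes. The sketch below is a minimal, assumption-laden example: it uses a simple blank-line split rather than a full OBO parser, the function name is hypothetical, and the "EX:" identifiers in the sample are made up.

```python
import re


def terms_without_is_a(obo_text):
    """Return (id, name) pairs for non-obsolete [Term] stanzas with no is_a tag."""
    missing = []
    for stanza in re.split(r"\n\s*\n", obo_text.strip()):
        lines = [ln.strip() for ln in stanza.splitlines()]
        if not lines or lines[0] != "[Term]":
            continue  # skip the header and [Typedef] stanzas
        tags = {}
        for ln in lines[1:]:
            tag, sep, value = ln.partition(": ")
            if sep:
                tags.setdefault(tag, []).append(value)
        if tags.get("is_obsolete") == ["true"]:
            continue
        if "is_a" not in tags:
            missing.append((tags.get("id", [""])[0], tags.get("name", [""])[0]))
    return missing


# Sample text; the EX: terms are invented for illustration only.
sample = """format-version: 1.2

[Term]
id: PO:0007022
name: imbibition

[Term]
id: EX:0000002
name: example child phase
is_a: EX:0000001 ! example parent phase
"""
print(terms_without_is_a(sample))
```

Note that the root term has no is_a parent by design, so it would also appear in the output and should be ignored.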

'plant life' ('life of whole plant')

Barry suggested that we need a term like "life of whole plant", of which all phases would form parts.

New relations linking the PAO and the PGDSO

derives_from

At the POC meeting on 7-19-11, we agreed to create a new relation in the PO called derives_by_manipulation_from (see POC_Conf._Call_7-19-11#derives_from_relations_in_PO:). This would be a special case of the RO relation derives_from.


participates_in and has_participant

See POC_Conf._Call_7-19-11#participates_in_relation


Chris asked: "will you add temporal relations to the stages?"