Difference between revisions of "PO Annotation Extensions (column 16)"

From Plant Ontology Wiki
 
(45 intermediate revisions by 2 users not shown)
Line 1: Line 1:
'''This page is under development, and the policies herein have not yet been approved by the Plant Ontology Consortium or curators.'''
+
=Introduction=
 +
* This page describes the proper use of column 16 in the annotation file (see [http://www.geneontology.org/GO.format.gaf-2_0.shtml GAF 2.0 format] and [[Annotation_Association_File_Format]]), which allows additional terms to be specified to extend the meaning of an annotation.  
  
The use of annotation extensions in the PO is based on the usage developed by the GO. A discussion of GO's annotation extension column can be found on the [http://wiki.geneontology.org/index.php/Annotation_Cross_Products GO wiki].
+
* Each annotation line in the file pairs a single term from the PO to a single gene product or other data object. Using column 16 allows annotators to add additional information by '''combining additional terms in a single annotation'''. These terms can even come from other [http://obofoundry.org OBO ontologies], or they could be gene products.
  
Each PO annotation pairs a single gene product or other object to a single term from the PO. This restricts annotators in what they can say - there must be a pre-existing term in the ontology, or one must be requested. It would be far less restrictive if the annotator could '''combine additional terms in a single annotation'''. These terms could even come from other [http://obofoundry.org OBO ontologies], or they could be gene products.
+
* The use of annotation extensions in the PO is based on the usage developed by the GO. A discussion of GO's annotation extension column can be found on the [http://wiki.geneontology.org/index.php/Annotation_Extension GO Annotation_Extension Page].
  
This page describes column 16 in the gene annotation file (see [http://www.geneontology.org/GO.format.gaf-2_0.shtml GAF 2.0 format] and [[Annotation_Association_File_Format]]), which allows additional terms to be specified to extend the meaning of an annotation. If and when an annotator chooses to do this, they are effectively creating an "on-the-fly" [http://wiki.geneontology.org/index.php/Category:Cross_Products cross product] term. We say "on-the-fly" because the combinatorial term is not added to the ontology (although it could be at a later stage, if the ontology editors choose to do do).
+
* If and when an annotator chooses to use column 16, they are effectively creating an "on-the-fly" [http://wiki.geneontology.org/index.php/Category:Cross_Products cross product] term. - ''Need to clarify this...''
  
 
=Uses=  
 
=Uses=  
There are many possible cases where one might with to create annotation extensions. For now, the PO limits the use of annotation extensions to links to other PO terms and to a limited set of relation. Annotators wishing to create other types of extensions in column 16 should contact the POC curators.
+
* There are many possible cases where one might wish to create annotation extensions. For now, the '''PO limits the use of annotation extensions to links to other PO terms and to a limited set of relations (see below).'''
 +
* Annotators wishing to create other types of extensions in column 16 should contact the POC curators.
  
 
==Annotation extensions to other PO terms==
 
==Annotation extensions to other PO terms==
  
Column 16 should be used when the appropriated part_of or participates_in relation is not specified or cannot be inferred within the PO, for example, for creating specific part_of relations when only a general part_of relation exists or for linking a structure to a specific growth or developmental stage.
+
* Column 16 should be used when the appropriate ''part_of'' or ''participates_in'' relation is not specified, or cannot be inferred within the PO, for example, for creating specific ''part_of'' relations when only a general ''part_of'' relation exists, or for linking a structure to a specific growth or developmental stage.
  
===Plant structures that are '''part_of''' another plant structure===
+
'''1. Plant structures that are ''part_of'' another plant structure'''
  
The anatomical structure branch of the PO is structured with general categories of structures that are parents to more specific children. For example, vascular leaf and non-vascular leaf are children of leaf, and aerial tuber and subterranean tuber are children of tuber. Any plant structure that can only be a part of one of the specific children will have a part_of relation only to that specific child (for example, costa part_of non-vascular leaf and leaf vascular system part_of vascular leaf), while plant structures that can be part of more than one type of the specific children will have a part_of relation to the more general term (for example, leaf tip part_of leaf).
+
* The ''plant anatomical entity'' (PAE; PO:0025131) branch of the PO is structured with general categories of structures that are parents to more specific child terms.  
 +
* Examples,  
 +
** ''vascular leaf'' and ''non-vascular leaf'' are children of ''leaf''
 +
** ''aerial tuber'' and ''subterranean tuber'' are children of ''tuber''
  
In order to prevent massive term inflation, PO curators prefer not to pre-compose all of the possible combinations of specific is_a and part_of children of more general term. For example, leaf has 15 is_a descendents (juvenile leaf, transition leaf, adult leaf, transition leaf, non-vascular leaf, vascular leaf, simple leaf, compound leaf, rosette leaf, cauline leaf, cigar leaf, embyro leaf, leaf spine, cotyledon, and flag leaf). To completely populate this branch of the ontology with leaf tip types would require the creation of 15 new terms (juvenile leaf tip, transition leaf tip, etc.), not to mention the terms for all of the other leaf parts, such as leaf lamina, leaf base, leaf epidermis, etc. Instead, column 16 can be used to post-compose any combination of is_a and part_of leaf that is needed. If it becomes clear that certain post-composed terms are being used extensively in annotations, pre-composed classes can be created.
+
* Any plant structure that can only be a part of one of the specific children will have a ''part_of'' relation only to that specific child (for example, costa part_of non-vascular leaf and leaf vascular system part_of vascular leaf), while plant structures that can be part of more than one type of the specific children will have a part_of relation to the more general term (for example, leaf tip part_of leaf).
 +
 
 +
* In order to prevent massive term inflation, PO curators prefer not to pre-compose all of the possible combinations of specific ''is_a'' and ''part_of'' children of a more general term.
 +
* For example, leaf has 15 is_a descendants (juvenile leaf, transition leaf, adult leaf, transition leaf, non-vascular leaf, vascular leaf, simple leaf, compound leaf, rosette leaf, cauline leaf, cigar leaf, embryo leaf, leaf spine, cotyledon, and flag leaf).
 +
 
 +
* To completely populate this branch of the ontology with leaf tip types would require the creation of 15 new terms (juvenile leaf tip, transition leaf tip, etc.), not to mention the terms for all of the other leaf parts, such as leaf lamina, leaf base, leaf epidermis, etc.  
 +
 
 +
* Instead, column 16 can be used to post-compose any combination of ''is_a'' and ''part_of'' leaf that is needed.  
 +
 
 +
* If it becomes clear that certain post-composed terms are being used extensively in annotations, pre-composed classes can be created.
  
 
====Examples:====
 
====Examples:====
Line 27: Line 41:
  
 
===Plant structures that occur during a particular growth stage===
 
===Plant structures that occur during a particular growth stage===
 +
* Some plant structures occur only in a particular phase of a plant life cycle, or during a particular growth or development stage.
 +
** For example, the seta only occurs during the sporophytic phase of a moss's life cycle, while a gametophore only occurs during its gametophytic phase.
 +
* The PO uses "A participates_in B" to specify that '''every instance of structure A is present during some instance of plant growth and developmental stage B'''.
 +
 +
* Although this relation does not strictly mean that the structure A only occurs during the growth stage B, PO curators only add the relation if that is the case (''Note: we may want to consider a new relation, like "only_participates_in"?'').
 +
 +
* In cases such as this, when an annotator makes an association to the term, they are automatically making an association to the growth and developmental stage that the term participates in.
 +
 +
* Many plant structures occur during multiple plant growth and development stages, and an annotator may wish to specify the stage in which the experiment took place. If the PO were to create new terms for structure during every stage in which it can occur, it would lead to excessive term inflation.
 +
* Up to this point, annotators have simply made an association to both the structure and the stage that was used for the experiment. However, this does not provide an easy way of later linking the structural annotation to the temporal annotation. Such a link can easily be made using column 16 (see examples 1 and 2 below).
 +
 +
* The PO advocates using this practice to describe whole plants during different life cycle phases (see examples 3 and 4 below). If a specific type of whole plant has some structural characteristics that distinguish it from other whole plants, and that type is being heavily used in annotations, PO curators may create a specific term that is a descendant of whole plant (e.g., PO:0025279 megagametophyte or PO:0009009 plant embryo).
 +
 +
====Examples====
 +
 +
1. If a gene product is localized in a leaf during senescence, the annotator would enter "PO:0009025" (leaf) in column 5 for POid, and "participates_in(PO:0001054)" (4 leaf senescence stage) in column 16 for the annotation extension.
 +
 +
2. If a gene product is localized in a fruit that has reached 10% of its final size, the annotator would enter "PO:0009001" (fruit) in column 5 for POid, and "participates_in(PO:0007032)" (FF.00 fruit size 10%) in column 16 for the annotation extension.
 +
 +
3. If an experiment describes the characteristics of a seedling, without more specific information on the parts of the seedling, the annotator would enter "PO:0000003" (whole plant) in column 5 for POid, and "participates_in(PO:xxxxxxx)" (seedling phase) in column 16 for the annotation extension.
  
Example 2: If a gene product is localized in a leaf during senesence, the PO ID (column 5) would be leaf (PO:0009025), and the annotation extension column would contain a cross-reference to participates_in leaf senescence stage (PO:0001054).
+
4. If an author wishes to annotate a manuscript that describes a gametophyte, s/he would enter "PO:0000003" (whole plant) in column 5 for POid, and "participates_in(PO:0028003)" (gametophyte phase) in column 16 for the annotation extension.
  
 
===Other types of relations to PO terms===
 
===Other types of relations to PO terms===
The use of relations other than part_of and participates_in for column 16 has not yet been approved. However, there may be times when annotators wish to use other relations to post-compose PO terms using column 16.
+
The use of relations other than ''part_of'' and ''participates_in'' for column 16 has not yet been approved. However, there may be times when annotators wish to use other relations to post-compose PO terms using column 16.
  
 
The other relations used in the PO are:
 
The other relations used in the PO are:
*has_part
+
* ''has_part''
*develops_from
+
* ''develops_from''
*derives_from
+
* ''derives_by_manipulation_from''
*adjacent_to
+
* ''adjacent_to''
 +
* ''located_in''
 +
* ''preceded_by''
 +
* ''precedes''
 +
 
 +
====The PO can only specify a relation between two classes when that relation is '''true for every instance in every taxon''' in which the two classes occur (the All-Some rule).====
 +
* A ''has_part'' B: only specified when ''every A has as a part some B''
 +
 
 +
* A ''develops_from'' B is only specified when ''every A develops from some B''.
  
The PO can only specify a relation between two classes when that relation is true for every instance in every taxon in which the two classes occur (the all-some rule). That is, A has_part Bis is only specified when every A has some B as part, and A develop_from B is only specified when every A develops from some B. Because of the diversity among different plant taxa, there are many cases where a relation is valid for some taxa but not others. Although the PO cannot specify these relations, they could be added as annotation extensions.  
+
* Because of the diversity among different plant taxa, there are many cases where a relation is valid for some taxa but not others. Although the PO cannot specify these relations, they could be added as annotation extensions.  
  
 
====Examples:====
 
====Examples:====
Line 48: Line 90:
 
2. Specifying spatial relations that may be present in the experiment being annotated, but that are not specified in the PO.
 
2. Specifying spatial relations that may be present in the experiment being annotated, but that are not specified in the PO.
  
For example, the PO specifies sporangium wall endothecium adjacent_to exothecium, because in every plant that has a sporangium wall endothecium, the endothecium is adjacent to (in permanent contact with) the exothecium. However, the PO does not specify exothecium adjacent_to sporangium wall endothecium, because there are plants that have an exothecium but no sporangium wall endothecium, and therefor the exothecium is not adjacent to the endothecium in those cases. If a user wanted to create an annotation for an experiment for a species that did have both an exothecium and a sporangium wall endothecium, and wanted to specify that a gene was expressed in an exothecium that was adjacent to a sporangium wall endothecium, s/he would enter "PO:0030073" (exothecium) in column 5 for the POid, and enter "adjacent_to(PO:0030049)" (sporangium wall endothecium) in column 16 for the annotation extension.
+
* Examples:
 +
** The PO specifies sporangium wall endothecium adjacent_to exothecium, because in every plant that has a sporangium wall endothecium, the endothecium is adjacent to (in permanent contact with) the exothecium.  
 +
** However, the PO does not specify exothecium adjacent_to sporangium wall endothecium, because there are plants that have an exothecium but no sporangium wall endothecium, and therefore the exothecium is not adjacent to the endothecium in those cases.
 +
** If a user wanted to create an annotation for an experiment for a species that did have both an exothecium and a sporangium wall endothecium, and wanted to specify that a gene was expressed in an exothecium that was adjacent to a sporangium wall endothecium, s/he would enter "PO:0030073" (exothecium) in column 5 for the POid, and enter "adjacent_to(PO:0030049)" (sporangium wall endothecium) in column 16 for the annotation extension.
 +
 
 +
Note: This type of annotation extension is still under development. Users should contact the PO curators if they wish to use column 16 in this way.
  
 +
==Specific suggestions for column 16==
  
 +
We have collected a set of specific suggestions for terms and relations to put in column 16 in a
 +
[https://docs.google.com/spreadsheet/ccc?key=0AhrY0qRdO4budHRqaklJcmJxRjlaVnV6UGlyd0xWbHc&hl=en_US Google Docs spreadsheet].
  
 +
These suggestions are primarily for structures that are part of a more general type (e.g., leaf base part_of leaf) but could be part of a more specific subtype (e.g., leaf base could be part_of vascular leaf, non-vascular leaf, juvenile leaf, rosette leaf, etc.).
  
This type of annotaion extension is still under development, and users should contact the PO curators if they wish to use column 16 in this way.
+
This list is under development, and we welcome other suggestions.
 +
 
 +
==What NOT to put in column 16:==
 +
* Do not use column 16 to annotate a particular genetic stock; these should go into column 8 (with(or)from).
 +
 
 +
For example, this is incorrect:
 +
 
 +
participates_in(PO:0001015)|participates_in(PO:0007130)|isolated_from_germplasm(Nipponbare)
  
 
==Annotation extensions to terms from other ontologies or databases==
 
==Annotation extensions to terms from other ontologies or databases==
Line 63: Line 121:
 
See the [[Annotation_Association_File_Format]] PO wiki page and the [http://www.geneontology.org/GO.format.gaf-2_0.shtml GAF 2.0 format] web page.
 
See the [[Annotation_Association_File_Format]] PO wiki page and the [http://www.geneontology.org/GO.format.gaf-2_0.shtml GAF 2.0 format] web page.
  
=Using column 16 to pass on annotations=
+
=Using column 16 to pass annotations from child to parent=
 +
 
 +
==For part_of and participates_in relations:==
 +
 
 +
* Due to the nature of the part_of and participates_in relations, any annotation that is associated with A should also be associated with B if A part_of B or A participates_in B. This holds true for both regular (pre-composed) relations and relations that are specified as annotation extensions.
 +
 
 +
* Annotations move automatically through pre-composed part_of and participates_in relations, but the current version of AmiGO (1.5) will not recognize any relations that are specified in column 16.
 +
 
 +
===For each line that contains a column 16 entry, the curator should manually create a new line that associates the annotation with the term in column 16===
 +
 
 +
===''has_part''===
 +
* When a ''has_part'' relation is pre-composed in the PO, it is often the case that the reciprocal part_of relation exists for at least some taxa. We have discussed various ways of properly moving annotations through ''has_part'' relations (from parent to child, and only in the case of appropriate taxa).
 +
 
 +
* Column 16 provides a way to deal with this, provided annotators can be trained to '''create the correct part_of relation in column 16'''.
 +
 
 +
* For example, all ''flowers'' in maize occur as part of an ''ear inflorescence''. Therefore, any gene expressed in a ''flower'' in maize should also be annotated to ''ear inflorescence''.
 +
 
 +
* Since we don't have the relation ''flower'' ''part_of'' ''inflorescence'', this will not happen automatically.
 +
 
 +
==For other relation types==
 +
 
 +
'''Annotations should not pass from child to parent through adjacent_to, develops_from, or derives_from relations. Therefore, if any of these relations is specified in column 16, no further action should be taken.'''
 +
 
 +
=Technical issues=
 +
 
 +
GO specifies which types of database IDs can go in column 16, and these include other GO IDs, so there is clearly no strict restriction against using the same database as in column 5.
 +
 
 +
In future versions of the AmiGO browser, it '''may''' automatically read the relation between column 5 (PO term) and column 16 in one row of an annotation file and create a new row that associates the annotation with the term in column 16.

Latest revision as of 20:23, 1 October 2015

Introduction

  • Each annotation line in the file pairs a single term from the PO to a single gene product or other data object. Using column 16 allows annotators to add additional information by combining additional terms in a single annotation. These terms can even come from other OBO ontologies, or they could be gene products.
  • The use of annotation extensions in the PO is based on the usage developed by the GO. A discussion of GO's annotation extension column can be found on the GO Annotation_Extension Page.
  • If and when an annotator chooses to use column 16, they are effectively creating an "on-the-fly" cross product term. - Need to clarify this...

Uses

  • There are many possible cases where one might wish to create annotation extensions. For now, the PO limits the use of annotation extensions to links to other PO terms and to a limited set of relations (see below).
  • Annotators wishing to create other types of extensions in column 16 should contact the POC curators.

Annotation extensions to other PO terms

  • Column 16 should be used when the appropriate part_of or participates_in relation is not specified, or cannot be inferred within the PO, for example, for creating specific part_of relations when only a general part_of relation exists, or for linking a structure to a specific growth or developmental stage.

1. Plant structures that are part_of another plant structure

  • The plant anatomical entity (PAE; PO:0025131) branch of the PO is structured with general categories of structures that are parents to more specific child terms.
  • Examples:
    • vascular leaf and non-vascular leaf are children of leaf
    • aerial tuber and subterranean tuber are children of tuber
  • Any plant structure that can only be a part of one of the specific children will have a part_of relation only to that specific child (for example, costa part_of non-vascular leaf and leaf vascular system part_of vascular leaf), while plant structures that can be part of more than one type of the specific children will have a part_of relation to the more general term (for example, leaf tip part_of leaf).
  • In order to prevent massive term inflation, PO curators prefer not to pre-compose all of the possible combinations of specific is_a and part_of children of a more general term.
  • For example, leaf has 15 is_a descendants (juvenile leaf, transition leaf, adult leaf, transition leaf, non-vascular leaf, vascular leaf, simple leaf, compound leaf, rosette leaf, cauline leaf, cigar leaf, embryo leaf, leaf spine, cotyledon, and flag leaf).
  • To completely populate this branch of the ontology with leaf tip types would require the creation of 15 new terms (juvenile leaf tip, transition leaf tip, etc.), not to mention the terms for all of the other leaf parts, such as leaf lamina, leaf base, leaf epidermis, etc.
  • Instead, column 16 can be used to post-compose any combination of is_a and part_of leaf that is needed.
  • If it becomes clear that certain post-composed terms are being used extensively in annotations, pre-composed classes can be created.

Examples:

1. If a gene product is localized to the leaf tip of a vascular leaf, the annotator would enter "PO:0025142" (leaf tip) in column 5 for POid, and "part_of(PO:0009025)" (vascular leaf) in column 16 for the annotation extension.

2. If a gene product is localized in a callus parenchyma cell that is part of a wound response on a branch, the annotator would enter "PO:0025285" (callus parenchyma cell) in column 5 for POid, and "part_of(PO:0025073)" (branch) in column 16 for the annotation extension. It might be possible to add an additional annotation extension, e.g., participates_in(GO:0009611) (response to wounding).

Plant structures that occur during a particular growth stage

  • Some plant structures occur only in a particular phase of a plant life cycle, or during a particular growth or development stage.
    • For example, the seta only occurs during the sporophytic phase of a moss's life cycle, while a gametophore only occurs during its gametophytic phase.
  • The PO uses "A participates_in B" to specify that every instance of structure A is present during some instance of plant growth and developmental stage B.
  • Although this relation does not strictly mean that the structure A only occurs during the growth stage B, PO curators only add the relation if that is the case (Note: we may want to consider a new relation, like "only_participates_in"?).
  • In cases such as this, when an annotator makes an association to the term, they are automatically making an association to the growth and developmental stage that the term participates in.
  • Many plant structures occur during multiple plant growth and development stages, and an annotator may wish to specify the stage in which the experiment took place. If the PO were to create new terms for structure during every stage in which it can occur, it would lead to excessive term inflation.
  • Up to this point, annotators have simply made an association to both the structure and the stage that was used for the experiment. However, this does not provide an easy way of later linking the structural annotation to the temporal annotation. Such a link can easily be made using column 16 (see examples 1 and 2 below).
  • The PO advocates using this practice to describe whole plants during different life cycle phases (see examples 3 and 4 below). If a specific type of whole plant has some structural characteristics that distinguish it from other whole plants, and that type is being heavily used in annotations, PO curators may create a specific term that is a descendant of whole plant (e.g., PO:0025279 megagametophyte or PO:0009009 plant embryo).

Examples

1. If a gene product is localized in a leaf during senescence, the annotator would enter "PO:0009025" (leaf) in column 5 for POid, and "participates_in(PO:0001054)" (4 leaf senescence stage) in column 16 for the annotation extension.

2. If a gene product is localized in a fruit that has reached 10% of its final size, the annotator would enter "PO:0009001" (fruit) in column 5 for POid, and "participates_in(PO:0007032)" (FF.00 fruit size 10%) in column 16 for the annotation extension.

3. If an experiment describes the characteristics of a seedling, without more specific information on the parts of the seedling, the annotator would enter "PO:0000003" (whole plant) in column 5 for POid, and "participates_in(PO:xxxxxxx)" (seedling phase) in column 16 for the annotation extension.

4. If an author wishes to annotate a manuscript that describes a gametophyte, s/he would enter "PO:0000003" (whole plant) in column 5 for POid, and "participates_in(PO:0028003)" (gametophyte phase) in column 16 for the annotation extension.
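
The column 16 values in the examples above all take the form relation(ID). As a rough illustration only, the Python sketch below splits such a value into its relation and target. The function name parse_column_16 and the regular expression are hypothetical, and treating "|" as the separator between extensions is an assumption; the GAF 2.0 specification remains the authority on the exact syntax.

```python
import re

# Assumed shape of one extension: relation(ID), e.g. participates_in(PO:0001054).
EXTENSION = re.compile(r"^(?P<relation>[A-Za-z_]+)\((?P<target>[^()]+)\)$")


def parse_column_16(value):
    """Split a column 16 value into (relation, target) pairs.

    Assumes that multiple extensions are separated by "|".
    """
    pairs = []
    for part in value.split("|"):
        part = part.strip()
        if not part:
            continue  # an empty column 16 yields no pairs
        match = EXTENSION.match(part)
        if match is None:
            raise ValueError(f"not in relation(ID) form: {part!r}")
        pairs.append((match.group("relation"), match.group("target")))
    return pairs


# Example 1 above: leaf (column 5) during the leaf senescence stage (column 16).
print(parse_column_16("participates_in(PO:0001054)"))
```

Keeping the relation and the target as a pair makes it straightforward to look up the stage term later without re-reading the raw string.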

Other types of relations to PO terms

The use of relations other than part_of and participates_in for column 16 has not yet been approved. However, there may be times when annotators wish to use other relations to post-compose PO terms using column 16.

The other relations used in the PO are:

  • has_part
  • develops_from
  • derives_by_manipulation_from
  • adjacent_to
  • located_in
  • preceded_by
  • precedes

The PO can only specify a relation between two classes when that relation is true for every instance in every taxon in which the two classes occur (the All-Some rule).

  • A has_part B: only specified when every A has as a part some B
  • A develops_from B is only specified when every A develops from some B.
  • Because of the diversity among different plant taxa, there are many cases where a relation is valid for some taxa but not others. Although the PO cannot specify these relations, they could be added as annotation extensions.

Examples:

1. Specifying a part_of relationship that is present in the experiment being annotated, but is only present as the reciprocal has_part relation in the PO.

For example, the PO specifies inflorescence has_part flower, because every inflorescence has at least one flower as a part. However, it does not specify flower part_of inflorescence, because there are many flowers in many species that are not part of an inflorescence. If a user wanted to create an annotation for an experiment where a gene was expressed in a flower that was part of an inflorescence, s/he would enter "PO:0009046" (flower) in column 5 for the POid, and enter "part_of(PO:0009049)" (inflorescence) in column 16 for the annotation extension.

2. Specifying spatial relations that may be present in the experiment being annotated, but that are not specified in the PO.

  • Examples:
    • The PO specifies sporangium wall endothecium adjacent_to exothecium, because in every plant that has a sporangium wall endothecium, the endothecium is adjacent to (in permanent contact with) the exothecium.
    • However, the PO does not specify exothecium adjacent_to sporangium wall endothecium, because there are plants that have an exothecium but no sporangium wall endothecium, and therefore the exothecium is not adjacent to the endothecium in those cases.
    • If a user wanted to create an annotation for an experiment for a species that did have both an exothecium and a sporangium wall endothecium, and wanted to specify that a gene was expressed in an exothecium that was adjacent to a sporangium wall endothecium, s/he would enter "PO:0030073" (exothecium) in column 5 for the POid, and enter "adjacent_to(PO:0030049)" (sporangium wall endothecium) in column 16 for the annotation extension.

Note: This type of annotation extension is still under development. Users should contact the PO curators if they wish to use column 16 in this way.

Specific suggestions for column 16

We have collected a set of specific suggestions for terms and relations to put in column 16 in a Google Docs spreadsheet.

These suggestions are primarily for structures that are part of a more general type (e.g., leaf base part_of leaf) but could be part of a more specific subtype (e.g., leaf base could be part_of vascular leaf, non-vascular leaf, juvenile leaf, rosette leaf, etc.).

This list is under development, and we welcome other suggestions.

What NOT to put in column 16:

  • Do not use column 16 to annotate a particular genetic stock; these should go into column 8 (with(or)from).

For example, this is incorrect:

participates_in(PO:0001015)|participates_in(PO:0007130)|isolated_from_germplasm(Nipponbare)
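
A check like the one below could flag entries such as the incorrect line above before a file is submitted. This is a minimal sketch under stated assumptions: the function check_column_16 is hypothetical, the approved set is limited to the two relations this page names as approved (part_of and participates_in), and a PO ID is assumed to be "PO:" followed by seven digits, as in the IDs quoted on this page.

```python
import re

# Relations this page describes as approved for column 16.
APPROVED_RELATIONS = {"part_of", "participates_in"}
# Assumption: PO IDs look like PO:0001015 (prefix plus seven digits).
PO_ID = re.compile(r"^PO:\d{7}$")
EXTENSION = re.compile(r"^([A-Za-z_]+)\(([^()]*)\)$")


def check_column_16(value):
    """Return a list of human-readable problems found in a column 16 value."""
    problems = []
    for part in value.split("|"):
        part = part.strip()
        match = EXTENSION.match(part)
        if match is None:
            problems.append(f"{part!r}: not in relation(ID) form")
            continue
        relation, target = match.groups()
        if relation not in APPROVED_RELATIONS:
            problems.append(f"{part!r}: relation {relation!r} is not approved")
        if not PO_ID.match(target):
            problems.append(f"{part!r}: {target!r} is not a PO ID")
    return problems


bad = ("participates_in(PO:0001015)|participates_in(PO:0007130)"
       "|isolated_from_germplasm(Nipponbare)")
for problem in check_column_16(bad):
    print(problem)
```

Only the germplasm entry is reported: its relation is not in the approved set and its target is not a PO ID. The two participates_in entries pass.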

Annotation extensions to terms from other ontologies or databases

Please contact the PO to discuss the type of cross product you wish to create, so that we can work out formatting specifications.

Usage specifications

See the Annotation_Association_File_Format PO wiki page and the GAF 2.0 format web page.

Using column 16 to pass annotations from child to parent

For part_of and participates_in relations:

  • Due to the nature of the part_of and participates_in relations, any annotation that is associated with A should also be associated with B if A part_of B or A participates_in B. This holds true for both regular (pre-composed) relations and relations that are specified as annotation extensions.
  • Annotations move automatically through pre-composed part_of and participates_in relations, but the current version of AmiGO (1.5) will not recognize any relations that are specified in column 16.

For each line that contains a column 16 entry, the curator should manually create a new line that associates the annotation with the term in column 16

has_part

  • When a has_part relation is pre-composed in the PO, it is often the case that the reciprocal part_of relation exists for at least some taxa. We have discussed various ways of properly moving annotations through has_part relations (from parent to child, and only in the case of appropriate taxa).
  • Column 16 provides a way to deal with this, provided annotators can be trained to create the correct part_of relation in column 16.
  • For example, all flowers in maize occur as part of an ear inflorescence. Therefore, any gene expressed in a flower in maize should also be annotated to ear inflorescence.
  • Since we don't have the relation flower part_of inflorescence, this will not happen automatically.

For other relation types

Annotations should not pass from child to parent through adjacent_to, develops_from, or derives_from relations. Therefore, if any of these relations is specified in column 16, no further action should be taken.
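
The manual step described in this section (one new line per column 16 entry, but only for part_of and participates_in) can be sketched in Python as follows. This is an illustration, not an approved procedure: the function propagate is hypothetical, a row is assumed to be a list of the 17 GAF 2.0 fields, and clearing column 16 in the new row is an assumption, since this page does not say what the new line should contain there.

```python
import re

# Only these relations pass annotations on, per the rules above.
PROPAGATING = {"part_of", "participates_in"}
# Assumption: extensions look like relation(PO:nnnnnnn), separated by "|".
EXTENSION = re.compile(r"^([A-Za-z_]+)\((PO:\d{7})\)$")


def propagate(row):
    """Return new rows that associate the annotation with column 16 terms.

    `row` is a list of GAF fields: index 4 is column 5 (PO ID) and
    index 15 is column 16 (annotation extension).
    """
    extra = []
    for part in row[15].split("|"):
        match = EXTENSION.match(part.strip())
        if match is None or match.group(1) not in PROPAGATING:
            continue  # adjacent_to, develops_from, etc.: no further action
        new_row = list(row)
        new_row[4] = match.group(2)  # annotate to the term from column 16
        new_row[15] = ""             # assumption: the new line has no extension
        extra.append(new_row)
    return extra


# A flower that is part of an inflorescence (other fields left empty here).
row = [""] * 17
row[4] = "PO:0009046"
row[15] = "part_of(PO:0009049)"
for new_row in propagate(row):
    print(new_row[4])
```

Rows whose column 16 uses adjacent_to or another non-propagating relation produce no new lines, matching the rule that no further action is taken for them.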

Technical issues

GO specifies which types of database IDs can go in column 16, and these include other GO IDs, so there is clearly no strict restriction against using the same database as in column 5.

In future versions of the AmiGO browser, it may automatically read the relation between column 5 (PO term) and column 16 in one row of an annotation file and create a new row that associates the annotation with the term in column 16.