Initial Meeting - Aug 2011
Notes about the dataset:
From Mary Schaeffer by email:
- These are all new annotations – and there are a lot, as each gene-model has some expression. I am lumping the putative splice variants to one model.
- For our 60 tissues for this set, the number of PO terms is some 52 distinct ones.
- one way to reduce the data size: look only at gene models that are not expressed in all tissues – this will reduce by some 50% but it is still a big dataset.
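The size-reduction idea in the last point can be sketched as follows. This is a minimal illustration, not the MaizeGDB pipeline: the dictionary layout, the gene model names, and the function name `drop_ubiquitous` are all hypothetical, and the toy data uses 3 tissues instead of 60.

```python
def drop_ubiquitous(expression, n_tissues):
    """Keep only gene models expressed in fewer than n_tissues tissues.

    `expression` maps a gene model name to the set of tissues in which
    it is expressed (hypothetical layout, for illustration only).
    """
    return {
        gene: tissues
        for gene, tissues in expression.items()
        if len(tissues) < n_tissues
    }


# Toy data: 3 tissues stand in for the 60 in the real set.
expression = {
    "gene_model_A": {"leaf", "root", "silk"},  # expressed everywhere, dropped
    "gene_model_B": {"leaf"},
    "gene_model_C": {"root", "silk"},
}
filtered = drop_ubiquitous(expression, n_tissues=3)
```

Only the gene models with a tissue-specific pattern remain in `filtered`, which is the subset the note suggests looking at.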
Here is the table, pulled from MaizeGDB using a curation interface. The comments column includes both the PO terms and some other stuff. I am still reviewing this a bit, and see that I missed one, but this should be adequate to get some inputs from you on what associations you would like to receive from us.
The first column is the tissue name in MaizeGDB; the second, a description provided by the researchers; the third, the related maize-specific term; the last, the other comments, mostly the string of PO accessions and term names used. The latter are also parsed into a separate extDB table so we can link to PO.org.
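As a rough sketch of how PO accessions might be pulled out of the comments column: the code below assumes accessions follow the pattern `PO:` plus seven digits and that the comment is free text. The function name and the sample comment are hypothetical, and `PO:0000000` is a placeholder, not a real term; this is not the actual MaizeGDB parser.

```python
import re

# Assumption: a PO accession is "PO:" followed by exactly seven digits.
PO_ACCESSION = re.compile(r"PO:\d{7}")


def extract_po_accessions(comment):
    """Return the distinct PO accessions in a comment, in order of first appearance."""
    seen = []
    for accession in PO_ACCESSION.findall(comment):
        if accession not in seen:
            seen.append(accession)
    return seen


# Hypothetical comment string; PO:0000000 is a placeholder accession.
comment = (
    "PO:0007063 LP.07 7 leaves visible; "
    "PO:0000000 placeholder term; see also PO:0007063"
)
accessions = extract_po_accessions(comment)
```

Deduplicating while keeping order makes it easy to write one row per tissue and accession into a link table.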
- General comments about the annotation process:
First, I linked each tissue to the maize-specific terms, which in turn are linked to PO, using PO definitions, and in some cases requesting tweaks from your group in the definitions to make a better fit to a generic plant.
Then, I used these PO terms as a starting point, with extensive editing and review that included checking all terms against their definitions, especially the temporal ones, and some requests to the PO to tweak definitions to better fit a generic case. Note: where the maize term silk in MaizeGDB is linked to all the terms for silk at PO, I will only use the Zea one for the association file (even though this seems smelly).
- General Note: The stages of maize growth used by the Kaeppler group are those from the extension booklet: Ritchie et al ‘How a Corn Plant Develops’, reprinted 2010. The stages used by MaizeGDB are from Abbe & Stein, with synonyms to the Ritchie staging, based on work by Pat Byrne at MaizeGDB in the early to mid 1990s. Kiesselbach was often used here, along with general Esau Plant Anatomy (the full version, not the recent watered-down edition). The images were useful for general staging, and I often asked one of the research dudes about kernel stages versus the number of days after pollination they show. They also have embryos in the freezer, and I have inquired about getting better staging on the embryos than days after pollination. In general, their staging was close to that of the Iowa booklet, but a bit earlier, by ~2 days.
- Specific note: The leaf number stage in maize is when the leaf is fully extended. Typically about 2 more leaves are visible at this time (without pulling apart the plant). So when it says V3, this matches the 5 leaves visible stage in PO, etc.
- Note, the links to PO from each of the Atlas tissues are mostly not yet in production; most of my finalizing of the annotations was done after July 1.
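The leaf-number rule in the specific note above amounts to a fixed offset, which can be written down as a small helper. This is only a sketch: the offset of 2 is the "about 2 more leaves" approximation from the note, the function name is hypothetical, and the result is a starting point for curation rather than a replacement for checking each sample.

```python
import re

# Approximation taken from the note: about 2 more leaves are visible
# than the fully extended leaf number given by the V stage.
VISIBLE_LEAF_OFFSET = 2


def visible_leaves_for_v_stage(stage):
    """Estimate the number of visible leaves for a V stage label such as 'V3'."""
    match = re.fullmatch(r"V(\d+)", stage.strip())
    if match is None:
        raise ValueError(f"not a V stage label: {stage!r}")
    return int(match.group(1)) + VISIBLE_LEAF_OFFSET


v3_visible = visible_leaves_for_v_stage("V3")
```

With this offset, `V3` gives 5 visible leaves, matching the example in the note.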
Questions
from MS by email:
- In a few cases, there is a classical gene name for the gene model. I assume these could be supplied as synonyms? Or, would you prefer they be supplied as a separate row?
Need to look at this, and ask PJ
- Do you still wish to have separate files for anatomy and growth terms?
I think that might be a good idea as well to make the huge file easier to deal with.
- Note, the instructions on the wiki for field 13 (TAXON) actually deal with field 12 and should be altered.
I am not sure I understand to which page you are referring. The info shown here: http://wiki.plantontology.org:8080/index.php/Annotation_File_Format looks like it matches the GO page (http://www.geneontology.org/GO.format.gaf-2_0.shtml), as it should. Could you please send the link?
- Question about column 16 (annotation extension) in the PO associations file: could this be used for a staging based on numbers of leaves that corresponds to other stages in maize?
Example: “leaf tip expanding V7 B73” corresponds to maize growth stage: “2 tassel initiation/early whorl stage”.
Should a term for tassel initiation be added here, for the association of “PO:0007063 LP.07 7 leaves visible”?
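For reference when discussing these questions, here is a minimal sketch of how one tab-delimited row in the 17-column GAF 2.0 layout could be assembled, showing where the synonym (column 11), object type (column 12), taxon (column 13), and annotation extension (column 16) sit. The column labels are shorthand, and every value except the PO accession quoted above and the Zea mays taxon ID is a hypothetical placeholder. In particular, the column 16 value is not valid extension syntax; what may go there is exactly the open question.

```python
# Shorthand labels for the 17 GAF 2.0 columns, in file order.
GAF_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "Ontology_ID",
    "DB_Reference", "Evidence_Code", "With_From", "Aspect", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon", "Date", "Assigned_By",
    "Annotation_Extension", "Gene_Product_Form_ID",
]


def gaf_row(**fields):
    """Build one tab-delimited row; columns not supplied are left empty."""
    unknown = set(fields) - set(GAF_COLUMNS)
    if unknown:
        raise KeyError(f"unknown GAF columns: {sorted(unknown)}")
    return "\t".join(fields.get(column, "") for column in GAF_COLUMNS)


row = gaf_row(
    DB="MaizeGDB",
    DB_Object_ID="EXAMPLE_GENE_MODEL",        # hypothetical identifier
    DB_Object_Symbol="EXAMPLE_GENE_MODEL",
    Ontology_ID="PO:0007063",                 # LP.07 7 leaves visible
    DB_Reference="EXAMPLE_REF:0000000",       # hypothetical reference
    Evidence_Code="IEP",                      # inferred from expression pattern
    Aspect="G",                               # assumed code for growth stage terms
    DB_Object_Synonym="classical_gene_name",  # column 11, hypothetical
    DB_Object_Type="gene",                    # column 12
    Taxon="taxon:4577",                       # column 13, Zea mays
    Date="20110802",
    Assigned_By="MaizeGDB",
    Annotation_Extension="HYPOTHETICAL_STAGE_LABEL",  # column 16, placeholder
)
columns = row.split("\t")
```

Supplying a classical gene name as a synonym then means filling column 11 on the same row, whereas a separate row would repeat the whole annotation with a different symbol.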
Issues and concerns
- From the POC conference call 8-2-11:
- Use of column 16 to designate the different stage descriptions in different sources
- Documentation of the statistical analysis and cut-off used for the microarray data: is this published yet?
Plan of action:
- Mary will work with JE to get SVN access set up (done)
- PO will review the mappings between the maize samples (60) and the PO terms (~52). MS sent us the mappings as a spreadsheet and we discussed it on the POC conference call 8-2-11.
- Do we need to add or modify any existing PO terms? Are we going to proceed with getting rid of the Zea "sensu" terms?
- Mary will upload a small file first (perhaps the annotations to the structure terms), and then upload the larger file.
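Splitting the association file so the structure annotations can go up first could look like the sketch below. It assumes a tab-delimited file with the aspect code in column 9, comment lines starting with `!`, and the codes `A` (anatomy) and `G` (growth stage). The function name and sample rows are hypothetical, and `PO:0000000` is a placeholder accession.

```python
def split_by_aspect(lines):
    """Group association rows by the aspect code in column 9."""
    by_aspect = {}
    for line in lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        columns = line.rstrip("\n").split("\t")
        by_aspect.setdefault(columns[8], []).append(line)
    return by_aspect


def toy_row(gene, term, aspect):
    """Build a 17-column row with only the fields needed for this example."""
    return "\t".join(["MaizeGDB", gene, "", "", term, "", "", "", aspect] + [""] * 8)


lines = [
    "!gaf-version: 2.0",
    toy_row("GENE_A", "PO:0007063", "G"),  # growth stage term from the notes
    toy_row("GENE_A", "PO:0000000", "A"),  # placeholder anatomy accession
    toy_row("GENE_B", "PO:0007063", "G"),
]
groups = split_by_aspect(lines)
```

Each group can then be written to its own file, with the anatomy rows in `groups["A"]` forming the smaller first upload.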