Investigating trait ontologies to facilitate integrating phenotype and genome sequence level information in wheat
Rudi Appels
Centre for Comparative Genomics, Murdoch University, Perth, WA 6150, Western Australia.
(61) 8 9360 6088
rudiappels5@gmail.com
Commercially important crops such as wheat (Triticum aestivum L.) have an extensive history of data collection relating phenotype to genetic information. The catalogue of gene symbols for wheat (http://www.shigen.nig.ac.jp/wheat/komugi/genes/symbolClassList.jsp, edited by R.A. McIntosh) provides a well-curated (that is, manually curated) assessment of the genetic studies carried out to map traits of interest to genetic maps. These traits are assigned symbols; an example entry is:
Pre-harvest sprouting
QTL: Several QTL for falling number and alpha-amylase activity, two indicators for pre-harvest sprouting resistance, were identified in {0169}. The most significant were associated with Xglk699-2A and Xsfr4(NBS)-2A, Xglk80-3A and Xpsr1054-3A, Xpsr1194-5A and Xpsr918-5A, Xpsr644-5A and Xpsr945-5A, Xpsr8(Cxp3)-6A and Xpsr563-6A, and Xpsr350-7B and Xbzh232(Tha)-7B {0169}.
A trait such as pre-harvest sprouting typically has several synonyms, for example sprouting index (QSi.crc-5D), dormancy (Q.SD1), or pre-harvest sprouting itself (Qphs.ocs-3A.2), all of which need to be captured for an effective analysis of published information.
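As a rough illustration of what capturing synonyms can involve, the sketch below maps the three trait names mentioned above to a single label and splits the three example QTL symbols into their parts. It is a minimal sketch under stated assumptions, not the catalogue's or GRAMENE's implementation: the synonym table, the canonical label, the function names, and the field names (trait, source, chrom, serial) are all hypothetical, and the pattern is only fitted to the three symbols quoted in the text.

```python
import re

# Hypothetical synonym table (assumption): keys are trait names as they might
# appear in publications; the value is an illustrative canonical label, not
# an official trait ontology term or identifier.
TRAIT_SYNONYMS = {
    "pre-harvest sprouting": "pre-harvest sprouting",
    "sprouting index": "pre-harvest sprouting",
    "dormancy": "pre-harvest sprouting",
}


def canonical_trait(name):
    """Return the canonical label for a published trait name, or None."""
    # Normalise case and collapse repeated whitespace before the lookup.
    key = " ".join(name.lower().split())
    return TRAIT_SYNONYMS.get(key)


# Pattern fitted only to the symbols quoted in the text (assumption):
#   Q + optional trait abbreviation + "." + source designator
#   + optional "-" chromosome (1-7 followed by A, B or D)
#   + optional "." serial number.
QTL_PATTERN = re.compile(
    r"^Q(?P<trait>[A-Za-z]*)"
    r"\.(?P<source>[A-Za-z0-9]+)"
    r"(?:-(?P<chrom>[1-7][ABD])(?:\.(?P<serial>\d+))?)?$"
)


def parse_qtl_symbol(symbol):
    """Split a QTL symbol into its parts; return None if it does not fit."""
    match = QTL_PATTERN.match(symbol)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for symbol in ("QSi.crc-5D", "Q.SD1", "Qphs.ocs-3A.2"):
        print(symbol, parse_qtl_symbol(symbol))
    print(canonical_trait("Sprouting index"))
```

A lookup table is used here because the synonyms do not share a common spelling; in practice the table would be populated from the curated catalogue rather than written by hand, and symbols that do not fit the pattern would need manual review.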
To relate genome sequence information to trait/phenotype studies, the available information in wheat is now cross-referenced to detailed consensus molecular genetic maps (http://ccg.murdoch.edu.au/cmap/ccg-live/cgi-bin/cmap/viewer) in the CMap software utilized by GRAMENE (http://www.gramene.org/). This presentation discusses the experience of developing trait ontologies for wheat, using the information established for rice, in the context of relating the CMap-based information that locates published QTL for traits to the genome sequence of wheat. The sequence of a complex genome such as that of wheat has been challenging to develop, but several new technologies are now converging to establish draft sequence-level definitions of the gene-space.