List of useful Datasets
Genevestigator
Organisms: High quality sources only:
Arabidopsis thaliana source expts:
- AG 8K array, 90 expts
NASC (72), GEO (18)
- ATH1: 22K array: 5760 expts
-Gruissem lab (18), NASC (326), FGCZ (55), GEO (2731), ArrayExpress (1679), AtGenExpress (876), TAIR (38), other (37)
Oryza sativa source expts:
- Os_51K Rice Genome Array 51K: 162 expts:
GEO (151), PlexDB (11)
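The per-source counts listed above can be checked against the stated totals for each array. The following is a minimal sketch of that arithmetic; the dictionary layout and the function name `check_totals` are illustrative assumptions, not part of Genevestigator.

```python
# Stated experiment totals and per-source counts, copied from the list above.
# The data structure and names here are illustrative only.
ARRAYS = {
    "AG 8K": (90, {"NASC": 72, "GEO": 18}),
    "ATH1 22K": (5760, {
        "Gruissem lab": 18,
        "NASC": 326,
        "FGCZ": 55,
        "GEO": 2731,
        "ArrayExpress": 1679,
        "AtGenExpress": 876,
        "TAIR": 38,
        "other": 37,
    }),
    "Os_51K": (162, {"GEO": 151, "PlexDB": 11}),
}


def check_totals(arrays):
    """Return {array name: (stated total, sum of per-source counts)}."""
    return {
        name: (stated, sum(sources.values()))
        for name, (stated, sources) in arrays.items()
    }


for name, (stated, summed) in check_totals(ARRAYS).items():
    status = "ok" if stated == summed else "MISMATCH"
    print(f"{name}: stated {stated}, summed {summed} -> {status}")
```

Re-running this after updating the counts is a quick way to catch transcription errors in the list.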
Rice Gene Expression Atlas (Wang et al, 2010)
- Contains 57,381 probe sets in total, of which 1,347 represent 1,260 indica transcripts and 54,168 represent 48,564 japonica transcripts.
- 31 tissues/organs from throughout the plant life cycle, from two indica varieties: Zhenshan 97 and Minghui 63.
- 8 callus samples spanning the tissue culture process of genetic transformation were also included.
- The total number of samples is 39. These cover only 15 different tissues/organs, because some were taken at several developmental stages (number of samples in parentheses):
-germinating seed
-plumule (2)
-radicle (2)
-seedling (5)
-leaf sheath (2)
-stem (2)
-leaf (4)
-palea + lemma
-stamen
-spikelet
-endosperm (3)
-panicle (5)
-root
-shoot
-Callus (8)
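The tissue list above can be tallied to confirm the stated figures (15 tissues/organs, 39 samples, 8 of them callus). This is a small sketch of that arithmetic; it assumes that the numbers in parentheses are sample counts and that entries without a number contribute one sample each.

```python
# Tissue/organ -> number of samples, copied from the list above.
# Assumption: entries without a number in parentheses count as 1 sample.
SAMPLES = {
    "germinating seed": 1,
    "plumule": 2,
    "radicle": 2,
    "seedling": 5,
    "leaf sheath": 2,
    "stem": 2,
    "leaf": 4,
    "palea + lemma": 1,
    "stamen": 1,
    "spikelet": 1,
    "endosperm": 3,
    "panicle": 5,
    "root": 1,
    "shoot": 1,
    "callus": 8,
}

n_tissues = len(SAMPLES)                  # distinct tissues/organs
n_samples = sum(SAMPLES.values())         # all samples, including callus
n_non_callus = n_samples - SAMPLES["callus"]

print(f"{n_tissues} tissues/organs, {n_samples} samples "
      f"({n_non_callus} plant samples + {SAMPLES['callus']} callus)")
```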
The data were deposited in the CREP (Collections of Rice Expression Profiling) database.
- It can be searched by single gene or multiple genes, using gene sequences, gene names, or probe identifiers, to retrieve quantified expression values, co-expressed genes, and the tissues and varieties in which the genes are expressed.
Gymnosperm data sets
- EST sequences for Ginkgo biloba and Cycas rumphii: http://www.ncbi.nlm.nih.gov/sra/SRP002638
May not be that useful; just contains the sequences.
- Valledor et al. 2010. Combined Proteomic and Transcriptomic Analysis Identifies Differentially Expressed Pathways Associated to Pinus radiata Needle Maturation. Journal of Proteome Research 9(8):3954-3979 (DOI: 10.1021/pr1001669)
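The SRA study listed above (SRP002638) can also be looked up programmatically through NCBI's E-utilities `esearch` endpoint. The sketch below only illustrates the idea: the helper names `esearch_url` and `esearch_ids` are hypothetical, the `retmax` default is arbitrary, and the network call is left to the reader to run.

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL for an Entrez database and a query term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urllib.parse.urlencode(params)


def esearch_ids(db, term, retmax=20):
    """Run the search and return the list of matching Entrez record IDs.

    Requires network access; not called automatically in this sketch.
    """
    with urllib.request.urlopen(esearch_url(db, term, retmax), timeout=30) as resp:
        payload = json.load(resp)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only prints the URL; call esearch_ids("sra", "SRP002638") to query NCBI.
    print(esearch_url("sra", "SRP002638"))
```

The same helper could be pointed at other Entrez databases by changing `db`, for example `gds` for GEO DataSets.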
Other sources of microarray data:
UC Davis Ricearray.org
Rice Array Database (RAD) is a public resource for gene expression analysis in rice, the most important staple food in the world. Over 1,000 microarray slides have been incorporated, and several kinds of analysis tools are available in RAD.
GEO Gene Expression Omnibus (NCBI)
A public functional genomics data repository supporting MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles.
CREP Database NRCP China
(Wang et al, 2010) Collections of Rice Expression Profiling
PlexDB Plant Expression Database
PLEXdb (Plant Expression Database) is a unified public resource for gene expression for plants and plant pathogens. PLEXdb serves as a bridge to integrate new and rapidly expanding gene expression profile data sets with traditional structural genomics and phenotypic data. The integrated tools of PLEXdb allow investigators to use commonalities in plant biology for a comparative approach to functional genomics through use of large-scale expression profiling data sets.
EBI ArrayExpress Archive: a database of functional genomics experiments, including gene expression
VirtualPlant integrates genomic data and provides visualization and analysis tools for rapid and efficient exploration of that data. The goal of VirtualPlant is to provide tools that help researchers generate biological hypotheses.
List of microarray data sources on TAIR: http://www.arabidopsis.org/portals/expression/microarray/microarrayDatasetsV2.jsp
PlantTribes
PlantTribes is an objective classification system for plant proteins based on cluster analyses of the inferred proteomes of the sequenced angiosperms Arabidopsis thaliana v. Columbia, Oryza sativa v. japonica (rice), and Populus trichocarpa (poplar). Sequence data for Carica papaya (papaya v. 1.0; R. Ming et al. in preparation) and Medicago truncatula (barrel medic, 60% complete) are also included in the current version of Tribes (v. 1.0).
PlantTribes 1.0 incorporates an extensive collection of microarray expression data from Arabidopsis microarray experiments [link to the doc page]. Expression data is linked to the individual genes in PlantTribes, and can be accessed through any result including Arabidopsis gene sequences.
CIBEX
CIBEX is a public database for microarray data, which is aimed at storing MIAME-compliant data in accordance with MGED Society recommendations.
NASCArrays is the Nottingham Arabidopsis Stock Centre's microarray database. Currently we hybridise over 1000 GeneChips per year mostly for Arabidopsis thaliana experiments which we make public, more than from any other GeneChip centre. There are also experiments from other species, and experiments run by other centres.
Need to find datasets for gymnosperms, lower plants, and non-model organisms