PO-FNA mapping results
Latest revision as of 19:58, 5 January 2012
These results include the run on 07/18/2011 by C. Mungall using the OBOL program. When the software returned multiple matches, RW chose the correct match, or used both if appropriate. Other matches were added manually by RW.
Exact duplicates (same term, category, and limitation) represent multiple definitions for the same term in the FNA glossary (multiple concepts with the same name). RW added numbers to correspond to FNA numbers.
Definitions from PO were checked against definitions from the FNA glossary at http://128.2.21.109/fmi/xsl/FNA/home.xsl
NOTE: OBOL will match to obsolete terms, with no warning.
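The matching described above (term names and synonyms, multiple hits, and silent matches to obsolete terms) can be sketched roughly as follows. This is not how OBOL works internally; it is a minimal illustration that assumes exact, case-insensitive string matching. The dictionary layout (`id`, `name`, `synonyms`, `is_obsolete`) and the function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_index(po_terms):
    """Map each lowercased name or synonym to the PO terms that carry it.

    Each term is a dict with hypothetical keys: "id", "name",
    and optionally "synonyms" (list of str) and "is_obsolete" (bool).
    """
    index = defaultdict(list)
    for term in po_terms:
        # A set avoids indexing a term twice when its name equals a synonym.
        labels = {term["name"].lower()}
        labels.update(s.lower() for s in term.get("synonyms", []))
        for label in labels:
            index[label].append(term)
    return index


def match_terms(fna_names, po_terms):
    """Match FNA term names to PO terms, flagging what needs manual review."""
    index = build_index(po_terms)
    results = {}
    for name in fna_names:
        hits = index.get(name.lower(), [])
        results[name] = {
            "ids": [t["id"] for t in hits],
            # Unlike the run described above, obsolete hits are reported.
            "obsolete": [t["id"] for t in hits if t.get("is_obsolete", False)],
            "multiple": len(hits) > 1,
        }
    return results
```

Any entry with a non-empty `obsolete` list or `multiple` set to `True` would still need the kind of manual check that RW performed.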
Number of FNA terms: | 839 | including duplicates with different meanings |
Number of PO anatomy terms: | 1080 | release 16 |
Number of matches (including auto-matches; all manually checked): | 930 | including duplicates with different meanings, plus those with the same meaning that map to multiple PO terms |
Number of matches made by software: | 264 | Matches to both term names and synonyms. This number is inflated, because many of these matches were not accurate (OBOL matched all concepts with the same name in FNA to the same PO term) |
Number of multiple matches made by software: | 49 | Automatched to more than one term. 26 of the multiples are because the Spanish and English names are the same |
Total number of one to one matches to existing terms: | 126 | includes one to one matches that are slight variants on name (e.g., FNA:fascicle to PO:flower fascicle, FNA:cell to PO:plant cell) |
Total number of matches to existing synonyms: | 193 | Includes 3 obsolete terms |
Total number of FNA terms that map to >1 PO term: | 14 | |
Total number of matches to obsolete terms: | 8 | Includes duplicate plural forms; 5 unique. 3 have been replaced by GO terms; the other 5 have no explanation as to why they are obsolete. |
Total number of new synonyms needed for existing terms: | 364 | includes plural forms, so many of these will go to the same PO term |
Total number of unique new terms needed: | 143 | if all are approved |
Total number of FNA terms that should be added as characters: | 101 | Too general for PO structures, better scored as phenotypes. Includes FNA terms like blister or scallop. |
FNA terms that could not be mapped to PO: | 17 | Some of these might be plant substances (e.g., raphide) |
~belong in GO: | 5 | |
~too vague for PO structures: | 12 | some of these could be scored as phenotypes |
Next steps:
- add 364 synonyms to existing terms
- fix several errors that were discovered while doing the mappings
- add 143 unique new terms, plus their synonyms
-FNA provides definitions, so this will be relatively easy
- work with FNA to create an official mapping file
-will need to get unique IDs for duplicate FNA terms
-will need to figure out how to handle FNA terms that map to >1 PO term
-discuss how to map terms that are too general for PO (101 character terms)
- begin work on phenotype/character terms, including the 101 from this list plus all of the FNA character terms
For a list of new synonyms for existing terms, see:
https://docs.google.com/spreadsheet/ccc?key=0AhrY0qRdO4budEtJQUc3VDZtZllyVXRWYlFTQXBJMFE&hl=en_US#gid=0
These will be added to the PO file with FNA:glossary as the reference. Once FNA creates unique IDs for its terms, we can reference them as FNA:id#.
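As an illustration of what such a synonym entry could look like, the sketch below formats a synonym line in OBO flat-file syntax with FNA:glossary as the cross-reference. It assumes the PO file is maintained in OBO format; the helper name `synonym_line` and the default scope `EXACT` are assumptions made for this example, and the appropriate scope would need to be chosen per synonym.

```python
# Synonym scopes defined by the OBO flat-file format.
VALID_SCOPES = {"EXACT", "BROAD", "NARROW", "RELATED"}


def synonym_line(text, scope="EXACT", xref="FNA:glossary"):
    """Format one OBO synonym tag-value line (hypothetical helper).

    The default scope and the helper itself are assumptions for
    illustration; only the FNA:glossary reference comes from the text.
    """
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown synonym scope: {scope}")
    # Escape backslashes first, then double quotes, as OBO requires
    # inside quoted strings.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'synonym: "{escaped}" {scope} [{xref}]'


print(synonym_line("fascicle"))
```

Once FNA assigns unique IDs, the same helper could be called with `xref` set to the specific FNA identifier instead of the glossary-level reference.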