Flora of North America- FNA

From Plant Ontology Wiki
Revision as of 16:34, 23 June 2011 by Cooperl (talk | contribs)

PO-FNA webex meeting June 21st 2011

In attendance:

POC members: Laurel Cooper (OSU), Pankaj Jaiswal (OSU), Ramona Walls (NYBG).

Collaborators: James Macklin (institution; Flora of North America; james.macklin@gmail.com) and Hong Cui (University of Arizona; hong1.cui@gmail.com)

http://www.efloras.org/flora_page.aspx?flora_id=1


Purpose of meeting: To establish a collaborative relationship with the PO/TO so that the extracted terms can be added to the ontologies. They will use the PO/TO IDs in their annotation.

Their group is interested in text mining applications using the PO and TO, as well as other ontologies.

Two applications: Flora of North America and a new project, for which funding is being applied for

Current project (parsing FNA) is in its 3rd year and ends next year

New ABI(?) proposal being prepared: Parsing literature descriptions for a plant family and bee(?) family.

This project will be more difficult: it will involve specimen descriptions, which are predicted to be messier than genus and species descriptions.

New project: Parsing Taxonomic Concept Analysis:

Project goal: Robustly produce the software they have developed on the FNA project and extend it to a new application- Taxonomic Concept Analysis

Taxonomic Concept: e.g., a published fact such as an accepted plant name and all its synonyms (simplest case); essentially a checklist

-difficult for users to make sense of all the names, etc

-Want to move from names to "character spaces"; parsed descriptions will describe what is held by the name

-Characters and attributes, convert words back into matrices, use that to do analyses: logic-based and entropy-based ("information gain")

Assess: What is the character space held by this name?

Then: Compare character spaces- more quantitative and interesting
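As a rough illustration of the entropy-based ("information gain") analysis mentioned above, here is a minimal sketch. It is not the FNA group's software: the matrix layout (one dict per parsed description), the taxon labels, and the character names `leaf_margin` and `petal_color` are all made up for the example. It scores each character by how much knowing its state reduces uncertainty about which taxon a description belongs to.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, character, label_key="taxon"):
    """Reduction in label entropy after splitting rows by one character."""
    base = entropy([row[label_key] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[character], []).append(row[label_key])
    remainder = sum(
        (len(group) / len(rows)) * entropy(group) for group in groups.values()
    )
    return base - remainder


# Hypothetical character matrix: each row is one parsed description.
matrix = [
    {"taxon": "A", "leaf_margin": "entire", "petal_color": "white"},
    {"taxon": "A", "leaf_margin": "entire", "petal_color": "yellow"},
    {"taxon": "B", "leaf_margin": "serrate", "petal_color": "white"},
    {"taxon": "B", "leaf_margin": "serrate", "petal_color": "yellow"},
]

for character in ("leaf_margin", "petal_color"):
    print(character, round(information_gain(matrix, character), 3))
```

In this toy matrix `leaf_margin` separates the two taxa completely while `petal_color` tells nothing about them, so the first character carries all of the information and the second none. Continuous or polymorphic characters would need binning or a set-valued representation before a score like this could be applied.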


RW: Discrete characters or continuous? JM: Want to encompass them all. In classical taxonomy, continuous characters have been converted to discrete characters. Methods will allow for polymorphism, and for discrete and continuous characters. Hardest part is defining terms and what those terms hold

8:30 Hong: Let's discuss specifics: How should the terms be submitted, and what format do they need to be in, to help the PO quickly evaluate and add them and give us the PO#?

How can we let the software know that some term will not be there in the PO/TO? Then they will have to figure out some other way for the SW to deal with them.


HC: List of terms from the literature (plant structures), with PO ids. A collaborator (Bob Morris?) and/or his student looked through the PO and many were not there.

PJ: Does the FNA assign the terms a unique ID #? Terms go into a relational table

PJ: Can develop a mapping file to match our terms with theirs; conventions exist, see: link to SVN site


PJ: Need to look over the list and determine whether they are 'valid' plant structure terms vs. phenotype terms

An FNA term may match a PO term name or synonym, may be added as a new synonym, or may be considered for addition as a new term
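The matching step described here could be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary layout, the function names `normalize` and `classify_terms`, and the placeholder IDs are hypothetical and do not reflect the actual mapping file conventions on the SVN site. Each FNA term is compared against PO term names first, then synonyms, and anything left over is flagged so that curators (and the parsing software) know it is not yet in the ontology.

```python
def normalize(term):
    """Lowercase and collapse whitespace so trivial variants still match."""
    return " ".join(term.lower().split())


def classify_terms(fna_terms, po_terms):
    """Label each FNA term as a name match, synonym match, or unmatched.

    po_terms maps an ontology ID to {"name": str, "synonyms": [str, ...]}.
    Returns a list of (fna_term, po_id_or_None, match_type) tuples.
    """
    by_name = {}
    by_synonym = {}
    for po_id, entry in po_terms.items():
        by_name[normalize(entry["name"])] = po_id
        for synonym in entry.get("synonyms", []):
            by_synonym[normalize(synonym)] = po_id

    results = []
    for term in fna_terms:
        key = normalize(term)
        if key in by_name:
            results.append((term, by_name[key], "name"))
        elif key in by_synonym:
            results.append((term, by_synonym[key], "synonym"))
        else:
            # Candidate for a new synonym or a new term; needs curator review.
            results.append((term, None, "unmatched"))
    return results


# Placeholder IDs and terms, invented for illustration only.
po_terms = {
    "PO:placeholder-1": {"name": "leaf", "synonyms": ["foliage leaf"]},
    "PO:placeholder-2": {"name": "petal", "synonyms": []},
}
fna_terms = ["Leaf", "foliage  leaf", "ligule"]

for row in classify_terms(fna_terms, po_terms):
    print(row)
```

Exact string matching after normalization is the simplest possible rule; the unmatched rows are the ones a curator would still need to review by hand.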

HC: We have already extracted the plant structure terms from the phenotype terms

PJ: simplest first step, to examine the list/vocabulary which has been curated

JM: Have a list or glossary of terms, of these, about 30% have definitions.

18:30


HC: During the project, we will be discovering new terms on a daily basis. Is it better to save and send them as a batch or individually? PO: Either is fine. It is good if we have background info on the terms: taxonomic characters, literature citations, etc. Can you provide examples of how each term is used (i.e., in a sentence)?

PJ: Character list: Can these go into the TO? PATO may be too general; they may not want to add all our specific terms. It may be useful for the upper levels though.

PO may consider a new ontology class- Plant Phenotype Ontology?

The PO will reference the ontology terms back to the FNA site through links, and provide examples of how they are used. FNA could be added as a dbxref in the ontology, and a subset could also be created, similar to the one for traitnet