Difference between revisions of "PO PlantSystematics conference call 10-21-11"

From Plant Ontology Wiki

Revision as of 21:50, 21 October 2011

Meeting to discuss technical issues of linking between PO and PS.org.

Present: Kevin Nixon, Pankaj Jaiswal, Justin Elser, Justin Preece, Ramona Walls


JP: Currently there is no way of automatically putting in links to keywords, which means manual curation that is time-consuming and can lead to errors.

PJ: If we had a list of keywords, we could create a mapping and just maintain that, then use the mapping to create links both from PO to PS.org and from PS.org to PO.

Mapping could be hosted on PO SVN site. We have a mappings folder already. KN can have access to this.
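
A minimal sketch of how one maintained mapping could drive links in both directions. It assumes a hypothetical tab-separated layout (keyword, then PO ID); the IDs and both URL patterns below are placeholders, not the actual PO mapping format or the real PO or PS.org addresses.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical mapping content: PS.org keyword <TAB> PO ID.
# The IDs are illustrative placeholders, not verified PO accessions.
MAPPING_TSV = "flower\tPO:0000001\nleaf\tPO:0000002\n"

# Placeholder URL patterns; the real sites may use different paths.
PO_TERM_URL = "http://po.example.org/term/{po_id}"
PS_SEARCH_URL = "http://ps.example.org/search?keyword={keyword}"


def load_mapping(text):
    """Return a dict of lower-cased keyword -> PO ID from tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0].strip().lower(): row[1].strip()
            for row in reader if len(row) >= 2}


def links_for(keyword, mapping):
    """Return (link to PO term, link to PS.org search) or None if unmapped."""
    po_id = mapping.get(keyword.lower())
    if po_id is None:
        return None
    return (PO_TERM_URL.format(po_id=quote(po_id, safe=":")),
            PS_SEARCH_URL.format(keyword=quote(keyword.lower())))


mapping = load_mapping(MAPPING_TSV)
print(links_for("Flower", mapping))
```

Because both links are derived from the same row, only the mapping file would need maintaining when a keyword or ID changes.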

If an image on PS.org has an association to PO, we could use that to automatically create associations.

Advantage of mapping/using keywords rather than links to individual images: Don't want to create xrefs for keywords (like flower) that have hundreds of images.

A potential problem is false positives, for example if a keyword is used incorrectly or doesn't exactly match the meaning of the PO definition.

KN: Contributors don't always put in keywords. This could be fixed by having students or others upgrade PS.org by adding keywords.

PS.org presently can't do boolean search. PO could input a list of words which PS.org could then split out into separate terms, and add wildcards. No need for any additional association file or mappings.
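
A rough sketch of the split-and-wildcard idea, assuming an "&" delimiter and SQL-style "%" wildcards. Both are assumptions, since the notes do not say what syntax PS.org uses, and the image keywords are made up for illustration.

```python
def to_wildcard_terms(word_list, delimiter="&"):
    """Split a delimited word list and wrap each term in % wildcards."""
    terms = [t.strip().lower() for t in word_list.split(delimiter)]
    return ["%" + t + "%" for t in terms if t]


def wildcard_matches(pattern, keyword):
    """Emulate a %term% pattern as a case-insensitive substring test."""
    core = pattern.strip("%")
    return core in keyword.lower()


patterns = to_wildcard_terms("stamen&anther&filament")

# Hypothetical image keywords, for illustration only.
image_keywords = ["Anthers", "petal", "staminal filament"]
hits = [k for k in image_keywords
        if any(wildcard_matches(p, k) for p in patterns)]
print(patterns)
print(hits)
```

Matching any one of the terms gives OR-like behaviour without a boolean search engine, at the cost of more false positives than an exact match would produce.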

KN could add flags to images that match keywords, then we could click on which images we want to keep -- to avoid false positives.

KN could create a table of keywords and IDs; we would keep it updated with our list of IDs.

PO is trying to encourage people to use standard vocabulary -- this is why we want to have links to PO terms/IDs.

Could begin mappings, but still have fuzzy search that will complement this.

PJ: (side note) working on software (with collaborator at OSU) to build sectors on images and automatically identify what is on image based on a library.

We have two needs: (1) link to images from PO; (2) link all PS.org images to PO IDs.

KN: First step is to implement a search that is sufficient given existing data structure, then move toward system that integrates PO by downloading a table from PO that integrates with his terms.

KN can start by sending us a list of his keywords. But many of them won't be relevant for PO.

PO will use this to create a mapping file.

Need to have a mechanism in place so that additions and changes are updated automatically as much as possible.

One option (to start smaller) is to start with only terms associated with e.g. flower and leaf to see how far we can extend from there -- images that have keyword flower and any keyword associated with it.

Start by getting keywords from KN and doing a candidate mapping.
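
One way a first candidate mapping might be drafted, sketched under the assumption that exact, case-insensitive name matches are a reasonable starting point. The keywords, term names, and IDs are invented placeholders; anything left unmatched would still need manual review.

```python
def candidate_mapping(keywords, po_terms):
    """Split keywords into exact PO name matches and leftovers for curation.

    po_terms maps PO ID -> term name.
    """
    by_name = {name.lower(): po_id for po_id, name in po_terms.items()}
    mapped, unmapped = {}, []
    for kw in keywords:
        key = kw.strip().lower()
        if key in by_name:
            mapped[key] = by_name[key]
        else:
            unmapped.append(kw)
    return mapped, unmapped


# Placeholder data; real PO IDs and PS.org keywords will differ.
po_terms = {"PO:0000001": "flower", "PO:0000002": "leaf"}
keywords = ["Flower", "leaf", "habitat"]

mapped, unmapped = candidate_mapping(keywords, po_terms)
print(mapped)
print(unmapped)
```

The leftover list is where keywords that are not relevant to PO would end up, so it doubles as a curation queue.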

Standard format of mapping file is on SVN -- we will use same format.

PO is using UTF8 (not unicode) because AmiGO requires UTF, but we may go to unicode in the future.

KN: IP for pages on PS.org will be moving, so we want to stick with www.plantsystematics domain - going onto gigabit server.


KN: Send list of keywords, work on search function, and give us a different URL for it.

We will input a list of words, delimited however we want (& or whatever), and KN will write a query that takes those words, does a fuzzy search on them, and then creates the link to the correct search results page. He already has something like this for taxonomic names, which works with missing letters and internal errors; he can transfer it to keyword search, although he hasn't done so yet.
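
To illustrate the kind of fuzzy matching described, here is a small sketch using Python's standard difflib module. This is a stand-in, not KN's taxonomic-name routine; the cutoff value and the keyword list are assumptions.

```python
import difflib


def fuzzy_lookup(word, known_keywords, cutoff=0.8):
    """Return known keywords that closely resemble a possibly misspelled word."""
    lowered = {k.lower(): k for k in known_keywords}
    close = difflib.get_close_matches(word.lower(), list(lowered),
                                      n=3, cutoff=cutoff)
    return [lowered[c] for c in close]


# Hypothetical keyword list, for illustration only.
known = ["stamen", "anther", "filament"]

print(fuzzy_lookup("stamn", known))      # missing letter
print(fuzzy_lookup("fillament", known))  # internal error
```

A higher cutoff reduces false positives but will miss badly misspelled keywords, so the threshold would need tuning against real PS.org data.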

Will need to figure out what delimiter to use to separate words: e.g., stamen&anther&filament

May want to have something like po_word:

Eventually, we could add other variables, like: po_word:something;taxon:something.
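
A sketch of how such a string might be parsed, assuming ";" separates fields, ":" separates a field name from its value, and "&" separates words. The delimiters are still undecided in the notes, so they are parameters here, and the taxon value is only an example.

```python
def parse_query(query, field_sep=";", word_sep="&"):
    """Parse 'name:word&word;name:word' into a dict of name -> list of words."""
    fields = {}
    for part in query.split(field_sep):
        if not part.strip():
            continue
        name, _, value = part.partition(":")
        fields[name.strip()] = [w.strip() for w in value.split(word_sep)
                                if w.strip()]
    return fields


# "Rosa" is an illustrative taxon value, not taken from the meeting notes.
parsed = parse_query("po_word:stamen&anther&filament;taxon:Rosa")
print(parsed)
```

If the string is passed in a URL query, "&" would need percent-encoding because it normally separates URL parameters, which may be a reason to choose a different word delimiter.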

May also need to think about window dimension later, if we decide to incorporate a pop-up from PO site.