Difference between revisions of "PO PlantSystematics conference call 10-21-11"

From Plant Ontology Wiki
 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
 
Meeting to discuss technical issues of linking between PO and PS.org.
 
Meeting to discuss technical issues of linking between PO and PS.org.
  
JP: no way of automatically putting in links to keywords automatically
+
Present: Kevin Nixon, Pankaj Jaiswal, Justin Elser, Justin Preece, Ramona Walls
  
PJ: if we had a list of keywords, we can create a mapping and just maintain that
+
==notes from meeting==
  
Mapping should be hosted on PO SVN site. We have a mappings folder already.
+
JP: Currently no way of automatically putting in links to keywords -- means manual curation, which is time consuming and can lead to errors.
  
Kevin can have access to this
+
PJ: if we had a list of keywords, we could create a mapping and just maintain that, then use that mapping to create links both from PO to PS.org and from PS.org to PO
  
If an image on PS.o has an association to PO, we can have an automatic association
+
Mapping could be hosted on PO SVN site. We have a mappings folder already. KN can have access to this.
  
Mapping: connect to 1000 image for petal. Don't want to create xrefs for all of them.
+
If an image on PS.org has an association to PO, we could use that to automatically create associations.
  
Contributors don't always put in keywords.
+
Advantage of mapping/using keywords rather than links to individual images: Don't want to create xrefs for keywords (like flower) that have hundreds of images.  
could have students etc. upgrade that
 
  
KN create a table of keywords and ID, we keep them updated with our list of ids
+
Potential problem is false positives, if for example keyword is used incorrectly or doesn't exactly match meaning of PO definition.
 +
 
 +
KN: Contributors don't always put in keywords. This could be fixed by having students or others upgrade PS.org by adding keywords.
  
Can't do boolean search, but we could send him list of words which he could then split out and turn into wildcards to separate terms. No need for any additional association file or mappings.
+
PS.org presently can't do boolean search. PO could input a list of words which PS.org could then split out into separate terms, and add wildcards. No need for any additional association file or mappings.
  
KN could add flag to images. We could click on which images we want.
+
KN could add flags to images that match keywords, then we could click on which images we want to keep -- to avoid false positives.
  
We have two needs: 1. link to images from PO 2. link all PSO images to PO IDs.
+
KN create a table of keywords and ID, we keep them updated with our list of ids
  
We are trying to encourage people to use standard vocabulary.
+
PO is trying to encourage people to use standard vocabulary -- this is why we want to have links to PO terms/IDs.
  
 
Could begin mappings, but still have fuzzy search that will complement this.
 
Could begin mappings, but still have fuzzy search that will complement this.
  
PJ: we are working on software to build sectors on images and automatically identify what is on image based on a library.
+
PJ: (side note) working on software (with collaborator at OSU) to build sectors on images and automatically identify what is on image based on a library.
 +
 
 +
'''We have two needs: 1. link to PS.org images from PO 2. link PS.org images to PO IDs.'''
  
 
KN: First step is to implement a search that is sufficient given existing data structure, then move toward system that integrates PO by downloading a table from PO that integrates with his terms.
 
KN: First step is to implement a search that is sufficient given existing data structure, then move toward system that integrates PO by downloading a table from PO that integrates with his terms.
Line 34: Line 37:
 
KN can start by sending us a list of his keywords. But many of them won't be relevant for PO.
 
KN can start by sending us a list of his keywords. But many of them won't be relevant for PO.
  
Need a mechanism in place so that additions and changes are updated automatically as much as possible.
+
PO will use this to create a mapping file.
  
RW and KN can start with only terms associated with ie flower and leaf to see how far we can extend from there.
+
Need to have a mechanism in place so that additions and changes are updated automatically as much as possible.
 +
 
 +
One option (to start smaller) is to start with only terms associated with e.g. flower and leaf to see how far we can extend from there -- images that have keyword flower and any keyword associated with it.
  
 
Start by getting keywords from KN and doing a candidate mapping.
 
Start by getting keywords from KN and doing a candidate mapping.
Line 42: Line 47:
 
Standard format of mapping file is on SVN -- we will use same format.
 
Standard format of mapping file is on SVN -- we will use same format.
  
We are using UTF8 (not unicode) - current AmiGO is UTF, but we may go to unicode
+
PO is using UTF8 (not unicode) because AmiGO requires UTF, but we may go to unicode in the future.
  
FAO is translating ontology into different languages.
+
KN: IP for pages on PS.org will be moving, so we want to stick with www.plantsystematics domain - going onto gigabit server.
  
IP will be moving, so we want to stick with www.plantsystematics domain - going onto gigabit
 
  
Could start with all images that have keyword flower and any keyword associated with it.
+
KN: list of keywords, work on search function and give us a different url for it.
  
Because he is using frames will have to something
+
We will input a list of words, delimited however we want (& or whatever), and KN will write a query that takes those, does a fuzzy search on them, then creates the link to the correct search results page (he has something like this that he has already used for taxonomic names - works with missing letters and internal errors - and can transfer this over to keyword search, even though he hasn't done it yet).
  
KN: list of keywords, work on search function and give us a different url for it.
+
Will need to figure out what delimiter to use to separate words: e.g., stamen&anther&filament
 +
 
 +
May want to have something like po_word:
 +
 
 +
Eventually, we could add other variables, like: po_word:something;taxon:something.
 +
 
 +
May also need to think about window dimension later, if we decide to incorporate a pop-up from PO site.
 +
 
 +
JE: important to test out the links first on our browser, to make sure they work, before we implement this on a large scale.
  
 +
==tasks==
 +
===short term:===
 +
*KN will send list of all PS.org keywords to RW at PO.
  
We will send a list of words, delimited however we want (& or whatever) and KN will write a query that takes those and do a fuzzy search on them then (he has something like this he has already used for taxonomic names - works with missing letter and internal errors - can transfer this over to keyword search, even though he hasn't done it yet)
+
*RW will work with JP to create a mapping file between PS.org keywords and PO IDs. This will go on PO's SVN and will follow standard mapping file format.
  
Will need to figure out what delimiter to use to separate words: e.g., stamen&anther&filament
+
*JP, JE and RW will create a few test cases of keyword searches (e.g., some with a single word, some with multiple words) and send these to KN so he can figure out the best format for the URL.
  
May want to have something like po_word:something;taxon:something
+
*JE will create the links from pages on the dev version of PO's AmiGO browser, to make sure they work properly.
  
We can always add other variables later when want.
+
===longer term===
 +
*PO will work on method of automatically adding links to images from PO based on PO term names and synonyms.
  
May need to think about window dimension later, if we can incorporate a popup from PO site.
+
*PO will work with KN to devise a strategy for associating PO IDs directly with PS.org images. These can then be used to link back to PO.

Latest revision as of 21:59, 21 October 2011

Meeting to discuss technical issues of linking between PO and PS.org.

Present: Kevin Nixon, Pankaj Jaiswal, Justin Elser, Justin Preece, Ramona Walls

notes from meeting

JP: Currently there is no way to automatically add links to keywords -- this means manual curation, which is time-consuming and can lead to errors.

PJ: If we had a list of keywords, we could create a mapping and just maintain that, then use that mapping to create links both from PO to PS.org and from PS.org to PO.

Mapping could be hosted on PO SVN site. We have a mappings folder already. KN can have access to this.
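A minimal sketch of how one maintained mapping could drive links in both directions. The two-column tab-separated layout, the PO IDs, and both URL patterns are placeholders made up for illustration; the actual format of the mapping files on the PO SVN and the real URL schemes are not described in these notes.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping content: PS.org keyword <TAB> PO ID.
# The IDs below are placeholders, not real PO identifiers.
MAPPING_TSV = "flower\tPO:0000001\npetal\tPO:0000002\ncorolla\tPO:0000002\n"

# Assumed URL patterns, for illustration only.
PS_SEARCH_URL = "http://ps.example/search?keyword={keyword}"
PO_TERM_URL = "http://po.example/term?id={po_id}"


def load_mapping(text):
    """Return (keyword -> PO ID, PO ID -> sorted keywords) from TSV text."""
    keyword_to_po = {}
    po_to_keywords = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        keyword, po_id = row[0].strip().lower(), row[1].strip()
        keyword_to_po[keyword] = po_id
        po_to_keywords[po_id].append(keyword)
    return keyword_to_po, {k: sorted(v) for k, v in po_to_keywords.items()}


def links_from_po(po_id, po_to_keywords):
    """PO term page -> one PS.org keyword-search link per mapped keyword."""
    return [PS_SEARCH_URL.format(keyword=k) for k in po_to_keywords.get(po_id, [])]


def link_from_ps(keyword, keyword_to_po):
    """PS.org keyword -> PO term link, or None if the keyword is unmapped."""
    po_id = keyword_to_po.get(keyword.lower())
    return PO_TERM_URL.format(po_id=po_id) if po_id else None
```

Because both directions are derived from the same file, only the mapping itself would need to be maintained.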

If an image on PS.org has an association to PO, we could use that to automatically create associations.

Advantage of mapping/using keywords rather than links to individual images: Don't want to create xrefs for keywords (like flower) that have hundreds of images.

Potential problem is false positives, for example if a keyword is used incorrectly or doesn't exactly match the meaning of the PO definition.

KN: Contributors don't always put in keywords. This could be fixed by having students or others upgrade PS.org by adding keywords.

PS.org presently can't do boolean search. PO could input a list of words which PS.org could then split out into separate terms, and add wildcards. No need for any additional association file or mappings.
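The split-and-wildcard idea can be sketched as follows. This is only an illustration of the approach, not how PS.org implements its search: the function names are invented, the `&` delimiter is one of the options mentioned in these notes, and Python's `fnmatch` stands in for whatever wildcard matching the PS.org database supports.

```python
import fnmatch


def to_wildcard_terms(word_list, delimiter="&"):
    """Split a delimited word list and wrap each term in wildcards."""
    terms = [w.strip().lower() for w in word_list.split(delimiter)]
    return ["*" + t + "*" for t in terms if t]


def matching_keywords(word_list, keywords, delimiter="&"):
    """Return the keywords that match ANY of the wildcard terms (an OR search)."""
    patterns = to_wildcard_terms(word_list, delimiter)
    return [
        k for k in keywords
        if any(fnmatch.fnmatchcase(k.lower(), p) for p in patterns)
    ]
```

Matching any term behaves like an OR over the list, which is the closest stand-in for a boolean search when only single-term wildcard queries are available.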

KN could add flags to images that match keywords, then we could click on which images we want to keep -- to avoid false positives.

KN could create a table of keywords and IDs, and we keep it updated with our list of IDs.

PO is trying to encourage people to use standard vocabulary -- this is why we want to have links to PO terms/IDs.

Could begin mappings, but still have fuzzy search that will complement this.

PJ: (side note) working on software (with a collaborator at OSU) to build sectors on images and automatically identify what is in an image based on a library.

We have two needs: 1. link to PS.org images from PO 2. link PS.org images to PO IDs.

KN: First step is to implement a search that is sufficient given the existing data structure, then move toward a system that integrates PO by downloading a table from PO that integrates with his terms.

KN can start by sending us a list of his keywords. But many of them won't be relevant for PO.

PO will use this to create a mapping file.

Need to have a mechanism in place so that additions and changes are updated automatically as much as possible.

One option (to start smaller) is to start with only terms associated with e.g. flower and leaf to see how far we can extend from there -- images that have keyword flower and any keyword associated with it.

Start by getting keywords from KN and doing a candidate mapping.
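One way a candidate mapping might be produced is by exact, case-insensitive matching of PS.org keywords against PO term names and synonyms, leaving everything else for manual review. This is a sketch under that assumption; the function name, the shape of the input data, and the IDs used in any example call are hypothetical.

```python
def candidate_mapping(ps_keywords, po_terms):
    """Propose keyword -> PO ID candidates by exact label match.

    po_terms maps a PO ID to its labels (term name plus synonyms).
    Returns (candidates, unmatched); both still need curator review,
    since an exact string match can be a false positive.
    """
    label_to_ids = {}
    for po_id, labels in po_terms.items():
        for label in labels:
            label_to_ids.setdefault(label.strip().lower(), set()).add(po_id)

    candidates, unmatched = {}, []
    for keyword in ps_keywords:
        ids = label_to_ids.get(keyword.strip().lower())
        if ids:
            candidates[keyword] = sorted(ids)
        else:
            unmatched.append(keyword)  # not relevant to PO, or needs a curator
    return candidates, unmatched
```

Keywords that land in the unmatched list would include the many PS.org keywords that are not relevant to PO, so that list is expected to be long.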

Standard format of mapping file is on SVN -- we will use same format.

PO is using UTF8 (not unicode) because AmiGO requires UTF, but we may go to unicode in the future.

KN: IP for pages on PS.org will be moving, so we want to stick with www.plantsystematics domain - going onto gigabit server.


KN: list of keywords, work on search function and give us a different url for it.

We will input a list of words, delimited however we want (& or whatever), and KN will write a query that takes those, does a fuzzy search on them, then creates the link to the correct search results page (he has something like this that he has already used for taxonomic names - works with missing letters and internal errors - and can transfer this over to keyword search, even though he hasn't done it yet).
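To illustrate the kind of fuzzy matching described (tolerating a missing letter or an internal error), here is a small sketch using Python's standard `difflib`. It is a stand-in, not KN's taxonomic-name search, whose algorithm is not described in these notes; the function name, keyword list, and cutoff are assumptions.

```python
from difflib import get_close_matches

# Illustrative keyword list; the real PS.org keyword list is not reproduced here.
KEYWORDS = ["stamen", "anther", "filament"]


def fuzzy_lookup(word, keywords, cutoff=0.8):
    """Return up to three keywords that closely resemble the given word."""
    return get_close_matches(word.lower(), keywords, n=3, cutoff=cutoff)
```

A higher cutoff means fewer false positives but less tolerance for misspellings, so the value would need tuning against real keywords.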

Will need to figure out what delimiter to use to separate words: e.g., stamen&anther&filament

May want to have something like po_word:

Eventually, we could add other variables, like: po_word:something;taxon:something.
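A sketch of how such a query string could be built and read back. The endpoint URL, the `q` parameter, and the function names are hypothetical; only the `po_word:` and `taxon:` field names and the `&` and `;` delimiters come from these notes. Since `&` and `;` already have a meaning inside URLs, the sketch percent-encodes the whole query.

```python
from urllib.parse import quote

BASE_URL = "http://ps.example/keyword_search"  # hypothetical endpoint


def build_query(po_words, taxon=None):
    """Join fields as 'po_word:w1&w2;taxon:name' (taxon is optional)."""
    parts = ["po_word:" + "&".join(po_words)]
    if taxon:
        parts.append("taxon:" + taxon)
    return ";".join(parts)


def build_url(po_words, taxon=None):
    """Percent-encode the query so '&', ';' and ':' are carried as data."""
    return BASE_URL + "?q=" + quote(build_query(po_words, taxon), safe="")


def parse_query(query):
    """Split a decoded query back into its fields."""
    fields = {}
    for part in query.split(";"):
        name, _, value = part.partition(":")
        fields[name] = value.split("&") if name == "po_word" else value
    return fields
```

Whichever delimiter is chosen, it should be one that cannot occur inside a keyword, otherwise the split on the receiving side becomes ambiguous.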

May also need to think about window dimension later, if we decide to incorporate a pop-up from PO site.

JE: important to test out the links first on our browser, to make sure they work, before we implement this on a large scale.

tasks

short term:

  • KN will send list of all PS.org keywords to RW at PO.
  • RW will work with JP to create a mapping file between PS.org keywords and PO IDs. This will go on PO's SVN and will follow standard mapping file format.
  • JP, JE and RW will create a few test cases of keyword searches (e.g., some with a single word, some with multiple words) and send these to KN so he can figure out the best format for the URL.
  • JE will create the links from pages on the dev version of PO's AmiGO browser, to make sure they work properly.

longer term

  • PO will work on method of automatically adding links to images from PO based on PO term names and synonyms.
  • PO will work with KN to devise a strategy for associating PO IDs directly with PS.org images. These can then be used to link back to PO.