Difference between revisions of "PO Release SOP Page"

From Plant Ontology Wiki

Revision as of 19:04, 19 August 2013

This page is a place to list all the steps we need to take for the database releases.

This page is under development: Let's not reinvent the wheel every 4 months.

Summary of Changes Wiki Pages


For example, see October_2011_Release_Page, which links to the "Summary of Changes" page, e.g., Summary of Changes to PO October 2011.

  • Also should list terms that have been merged, changed definitions, or renamed.
  • If possible, it is also good to list new synonyms for existing terms, especially if those synonyms are quite different from the original name.

For example, we would want to highlight that cone is a synonym of strobilus, but it is not that important to note that portion of epidermal tissue is a synonym for epidermis.

Updating the PO Anatomy Glossary

Plant Anatomy Glossary

The Plant Anatomy glossary is linked to the live database and updates automatically, but it is a good idea to check it after the database is switched.

- 8-19-13 Tracker: Japanese synonyms are not displaying correctly

The Japanese synonyms are not displaying correctly, even though they work ok on the browser

Preparing the Association files for the Release

Existing annotation files:

  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see the wiki page "Changes to the ontology").
  • You can use grep on the command line to search for the term ids that have been obsoleted.

* Note: Even if terms have been merged and an alt id has been created, the affected lines will not load correctly. Manually fix the annotation file(s) by replacing the old id with the id that replaces it.

  • Inform the collaborating group about what needs to be fixed and/or fix those ourselves if necessary.
  • In some cases, the affected annotation lines will need to be pulled out of the file.
  • Any problems found in the association files should be fixed in the files in the trunk folder and then copied over to beta, replacing the existing file with the problems. (see below)

This accomplishes two things: it maintains the version history on trunk and makes sure the fixes are not lost.
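The check described above can also be scripted. The following is a minimal sketch, not the team's actual tooling: it assumes the association files are plain tab-delimited text in which PO ids appear as "PO:" followed by seven digits, and the function names and the `replacements` mapping are hypothetical.

```python
import re

# Matches identifiers such as PO:0009004 (assumed: "PO:" plus seven digits).
PO_ID = re.compile(r"\bPO:\d{7}\b")


def find_stale_ids(lines, replacements):
    """Return (line_number, stale_id) pairs for every stale id found.

    `replacements` maps an obsolete or alternate id to the id that replaces
    it, or to None when the annotation line has to be pulled from the file.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for po_id in PO_ID.findall(line):
            if po_id in replacements:
                hits.append((number, po_id))
    return hits


def fix_stale_ids(lines, replacements):
    """Rewrite replaceable ids; return (kept_lines, pulled_lines)."""
    kept, pulled = [], []
    for line in lines:
        ids = PO_ID.findall(line)
        # A line that references an id with no replacement is pulled out.
        if any(i in replacements and replacements[i] is None for i in ids):
            pulled.append(line)
            continue
        # Ids that are not in the mapping are left unchanged.
        kept.append(
            PO_ID.sub(lambda m: replacements.get(m.group(0)) or m.group(0), line)
        )
    return kept, pulled
```

Run `find_stale_ids` first to produce the report for the collaborating group, then apply `fix_stale_ids` to the copy on trunk so that the version history is preserved.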

TAIR

  • It is important to let TAIR know ahead of time that we are preparing to do a release and give them the list of changed or obsoleted terms. They will have to remap those to the new terms in their files. They like to have at least a week.
  • Usually we ask them to stop their automatic weekly commits to the SVN for a couple of weeks while we get everything fixed.

New Annotation files

  • Need to be mapped to PO terms
  • Use appropriate cut-offs for each data set
  • New DBxRefs need to be set up: see below
  • Create a page on the wiki Annotations for the data set to document it

Column 16 annotations (for future releases):

The IT team should run a script to check for needed column 16 annotations; see list at PO Suggestions for Col 16 and more info at: PO Annotation Extensions (column 16)


Note: As the number of annotations and the size of the association files increase, this process will take longer. Version #16 required 46 hours to load, so it is a good idea to do this check before we start the test load.

Fixing Links to Dbxrefs:

This was an issue that was discussed and worked on a lot for the version #16 release.

For more details of the discussions see the minutes from POC_Meetings_Minutes on: 9-27-11 through 10-18-11.

Our current version of AmiGO has a built-in set of dbxrefs, in a Perl file, and it is not reading the current GO file. In order to update these, this file has to be edited manually.

We decided on the POC_Conf._Call_9-27-11 to use the PO_DBXref.txt file on the SVN to store the updated dbxrefs stanzas. Then the dbxrefs Perl file can be modified based on that.

  • An older version of this file is also available through the documentation page on the PO website:

The link on the Documentation page should go to the current file on the SVN. This change has been implemented.

Note from LC: Do we really need to have this link on our website, since it is just for our internal use?

Po-Refs Page

- Links to references that do not have a PubMed ID, ISBN number, or stable URL should go to the new PO references page

Links to PO:ids in term definitions

Starting in the April 2012 release we are putting the PO:ids in both the term definitions and in the comments. Currently, the links in the comment are working but not the ones in the definitions.

There is a stanza in the dbxrefs file which refers to these, but it needs to be updated in the AmiGO browsers.

Icons for new relations

If new relations have been introduced and approved, then the following steps should be taken:

  • Icons need to be stored in the folder on the SVN: icons folder


PJ requested that we embed images, linked to external files, next to each relation, but this may not work because the link view images on the SVN does not appear to work. Also, one is generally required to upload an image before it can be embedded on a wiki page. Tried to set it up using html but the <img> tag did not work on the wiki.


Need to add link to relations page from icons on live browser.

Should share the icons with OBO Foundry, so they can be used consistently across ontologies.

JE: I added the images for this next to each relation section. Links go to Amigo icons, not from svn as that doesn't work. Had to enable allow external images in mediawiki. No need for <img> tag. Just put the link to images you want embedded. These do not seem to be working

Preparing Ontology Files for Release:

Curators names in definitions, etc

  • Curators' names in definitions and synonyms should be spelled out in full (for example, GR:Pankaj_Jaiswal). These are not links unless they are POC.
  • POC curators should be listed on the wiki page POC curators
  • Any reference that begins with POC (e.g., POC:curators, POC:wood_curators, POC:Maria_Alejandra_Gandolfo), will link to the Plant_Ontology_curators page.
  • If the definition is worked out as a group, then the source should be POC:curators. This should link to a page where all the curators of the POC are listed.

Updating the translations

  • JE has a script to insert them. As of Dec 2012, the insert_translations script is inserting extra spaces after the Japanese name. JE fixed it manually. Note that this only temporarily fixes the obo file; we still need to find out why it is happening.
  • There is a folder on the svn translations. Not sure how this gets updated.
  • In release #20 created a Google doc New and Renamed PO Terms- Release #20 in order to work collaboratively with YY and MAG on getting the translations done. This worked well. Then they were manually inserted into the OBO file. This took time.
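Until the cause of the extra spaces is found, the manual clean-up could be automated along these lines. This is a sketch only: it assumes the translations are stored on OBO `synonym:` lines with the name in double quotes, and the function name is hypothetical. It does not touch the insert_translations script itself.

```python
import re

# A synonym line: the quoted text (allowing backslash-escaped characters),
# followed by the rest of the line (scope, type, dbxrefs).
SYNONYM = re.compile(r'^synonym: "((?:[^"\\]|\\.)*)"(.*)$')


def strip_synonym_padding(line):
    """Remove leading/trailing whitespace inside the quoted synonym text."""
    match = SYNONYM.match(line)
    if not match:
        return line  # not a synonym line: leave untouched
    text, rest = match.groups()
    return f'synonym: "{text.strip()}"{rest}'
```

Applying this to every line of the obo file leaves all other tags unchanged, so a diff of the result against the input shows exactly which synonyms were padded.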

Quality Checks and Internal Review:

Before each release, ontology editors and curators should run some qc checks. In fact, these should be done on a regular basis in between releases, but it is crucial to do them immediately before a release.

  • Run the reasoner to remove any redundant links.
  • Do a search for extra space and odd characters in terms, definitions and dbxrefs (see below:).
  • The whole POC team should take a little time to do an internal review. Look for any inconsistencies, typos, broken links, etc.
  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see below:)
  • Update the translation files
  • Add any new DbxRefs and fix any broken links
  • Update any annotation files

Located_in: see embryo sac (PO:0025074)

Dev version

Between releases, editing is done on the developers version of the plant_ontology.obo file located at: OBO Format.

  • plant_ontology.obo

This is the "developers" version and is loaded onto the Dev Browser nightly, along with the SVN version #. No annotations are loaded on this browser.

During the preparation of the release, editing may continue on the dev file, without disrupting the version on the live browser or the beta browser (see below).

Once the initial Quality Checks (see above) are completed, the plant_ontology_assert.obo file can be created (see below).

The "plant_ontology_assert.obo" file can be loaded onto the Beta Browser, along with annotation files so we can check for any problems.

(Note that in the past, we moved everything to a new beta branch, prior to the release, but this practice was discontinued after the October 2011 release.)

This is a temporary location for the curators and reviewers to look over the new version of the ontology for the planned release.

  • Note: We should also put the revised PO front page on the Beta Browser for review, prior to publishing.


If a problem is found that needs to be fixed for the release:

In the event that a problem is found in the beta version prior to the release, the fix should be made in the DEV version and the process of generating the additional files and loading the beta browser should be repeated anew.

This ensures that the fixes are not lost from the editor's version in the process of the release. This is why it is important for the editors to do a thorough review and run QC checks prior to creating the "asserted file" and loading it on beta.

Once the editing and quality checks are completed for the release, the following additional versions of the Plant Ontology file are prepared.

This needs to be done prior to the release, as plant_ontology_assert.obo is the file loaded onto the AmiGO browser (see below for details). Files created for release based on the 'final' version of plant_ontology.obo:

  • plant_ontology_assert.obo
    • plant_ontology_assert_basic.obo
      • po_anatomy.obo
      • po_temporal.obo


LC: Note: These files should not remain in the trunk folder with the dev version of the ontology once the release is out. They are no longer in sync with the editors' version of the file, and having them in two places is confusing.

In hindsight, I think the additional files should only be generated once the review process is completed. If an Ontology problem is found, these will have to be recreated (see below:)

Details of the alternative files for the release

plant_ontology_assert.obo

This version is the same as plant_ontology.obo, with all nonredundant implied links asserted (added) by a reasoner. We generally use the OBO-Release manager Oort.

The "assert implied links" panel in OboEdit does not work very well.

To create plant_ontology_assert.obo:

  • open the desktop GUI for Oort (OBO-Release manager)
  • browse to the latest version of plant_ontology.obo as the input file
  • specify a folder for the output (first create a new folder on your hard drive for the files and some subfolders created by Oort)
  • on the "Advanced" tab, uncheck all options.
  • select only "OBO" as a Write Format
  • choose either the HermiT or the Pellet reasoner (results should be the same)
  • click run
  • rename the output file "po.obo" as "plant_ontology_assert.obo" and save in the appropriate directory
  • Note that unlike OboEdit, Oort also asserts the relations that are specified in any intersection_of lines. For example, if leaf sinus is defined as intersection_of is_a sinus and intersection_of part_of leaf, Oort will assert "leaf sinus part_of leaf". "Assert implied relations" in OE will NOT do this, but the relation will show up in graphical view if the reasoner is on.
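To illustrate the last point, the sketch below shows which links follow directly from a term's intersection_of lines. It is not how Oort works internally (Oort uses a full reasoner); it only assumes the OBO 1.2 convention that a genus line carries a bare id and a differentia line carries a relation followed by an id. The ids "X:sinus" and "X:leaf" are placeholders, not real PO ids.

```python
def implied_links(stanza_lines):
    """Derive the is_a / relationship lines implied by intersection_of lines."""
    links = []
    for line in stanza_lines:
        if not line.startswith("intersection_of:"):
            continue
        # Drop the tag and any trailing "! comment", then split into tokens.
        value = line.split(":", 1)[1].split("!", 1)[0].split()
        if len(value) == 1:
            links.append(f"is_a: {value[0]}")  # genus
        elif len(value) == 2:
            links.append(f"relationship: {value[0]} {value[1]}")  # differentia
    return links


leaf_sinus = [
    "id: X:leaf_sinus",
    "intersection_of: X:sinus ! sinus",
    "intersection_of: part_of X:leaf ! leaf",
]
for link in implied_links(leaf_sinus):
    print(link)
```

Comparing this output with the lines actually present in plant_ontology_assert.obo is a quick way to confirm that the cross-product relations were asserted.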

plant_ontology_assert_basic.obo

This version is the same as plant_ontology_assert.obo, but with only is_a and part_of relations. This file is created using either a filtered save in OboEdit or a query in Oort.

To create plant_ontology_assert_basic.obo in OboEdit:

  • open plant_ontology_assert.obo
  • select save as, and use advanced save
  • set the save path to .../plant_ontology_assert_basic.obo
  • check "Filter links" only
  • set up 2 filters to "Matches any": "type, have, any text field, contains, is_a", and "type, have, any text field, contains, part_of"
  • Notes from RW: I can supply the filter as a text file (assert_basic_filter.txt);

I used "Greedy root selection algorithm" and "Don't write current ID rules", but it doesn't matter for the substantive parts of the file.

  • Add a comment "Filtered from plant_ontology_assert.obo to have only is_a and part_of relations. Matches plant_ontology.obo version ####."
  • Save and re-open the file in OboEdit
  • You will see a warning that a number of terms have exactly one intersection_of relation. These are the cross products that were set up using participates_in, has_participant, or develops_from.
  • These must be removed manually (in the text file), but there aren't many of them. There is probably a way to filter out all intersection_of relations with OboEdit too (they aren't needed, since all implied links have already been asserted).
  • Remove any intersection_of tags that are causing problems using a text editor (or OboEdit, if it will work) and save the file.

It might be a better practice to just remove all of the intersection_of lines from the file.
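As an alternative to the filtered save plus manual clean-up, the same result could be approximated on the text of the file. This is a sketch under the assumption that every non-is_a link is written as a `relationship:` line and that dropping all intersection_of lines is acceptable (as suggested above); the function name is hypothetical, and [Typedef] stanzas for other relations are left in place.

```python
KEEP_RELATIONS = {"part_of"}  # is_a links use their own tag and are always kept


def to_basic(lines):
    """Keep is_a and part_of links only, and drop every intersection_of line."""
    kept = []
    for line in lines:
        if line.startswith("intersection_of:"):
            continue
        if line.startswith("relationship:"):
            parts = line.split()
            # parts[1] is the relation name, e.g. part_of or develops_from.
            if len(parts) < 2 or parts[1] not in KEEP_RELATIONS:
                continue
        kept.append(line)
    return kept
```

The output should still be opened in OboEdit afterwards, since that is the check the procedure above relies on.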

Separate Aspect files:

See discussion on POC_Conf._Call_4-17-12

We will keep generating these and posting them on Bioportal through the next release #18. After that people will have to come to the SVN to obtain them.

  • Alert users that we will stop supplying these files after release 18. Bioportal should only serve plant_ontology.obo (or does it serve plant_ontology_assert.obo?)
  • Will these still be on Bioportal as "legacy data"? What about the GRO Ontologies? These are also causing the same issue.
  • Who else is still using these files?
  • Need to check with Gramene and TAIR and SGN to see if they are still using those files -

TAIR is not, Gramene? sent inquiry 7-13-12, waiting for reply from Gramene

These should really be called po_temporal_assert_basic.obo and po_anatomy_assert_basic.obo, but if this change is made, we must fix the links on the download page and check whether any users are accessing the files under the old names.

These files were originally created after the change in January 2011, as our collaborators at TAIR still needed the separate files of the ontology. Since they preferred the 'basic' form of the ontology file, these were derived from the plant_ontology_assert_basic.obo file. Thus these really should be called po_anatomy_assert_basic.obo and po_temporal_assert_basic.obo.

For other users such as SGN, they would prefer to have all the relations in the files, except for the ones that go between the two branches (ie: participates_in).

Separate aspect files: Alert users that after release #18, we will not be offering these on Bioportal any longer. See notes of 4-17-12 meeting

po_anatomy.obo

This file is created by filtering plant_ontology_assert_basic.obo to contain only terms from the plant anatomical entity branch. The instructions below describe how to do this in OboEdit, but in theory it could be done using Oort as well.

To create po_anatomy.obo in OboEdit:

  • Open plant_ontology_assert_basic.obo
  • Manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).
  • remove:

gametophyte (PO:0009004); replaced_by: PO:0028003 gametophyte development stage

sporophyte (PO:0009003) replaced_by: PO:0028002 sporophyte development stage

seedling PO:0008037; replaced_by: PO:0007131 seedling development stage

  • After removing these links, save the file as some dummy name, so that you can use it to create po_temporal.obo.
  • Open advanced save dialog and set the save path to .../po_anatomy.obo
  • Check "Filter terms" only
  • Set up filter to "Have, namespace, contains, plant_anatomy"
  • Check "always save properties"
  • Add a comment "Filtered from plant_ontology_assert_basic.obo to have only plant anatomical entity terms. Matches plant_ontology.obo version ####."
  • Save and check by re-opening OboEdit

po_temporal.obo

This file is created by filtering plant_ontology_assert_basic.obo to contain only terms from the plant structure development stage branch. The instructions below describe how to do this in OboEdit, but in theory it could be done using Oort as well.

To create po_temporal.obo in OboEdit:

  • open plant_ontology_assert_basic.obo
  • manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).
  • open advanced save dialog
  • set the save path to .../po_temporal.obo
  • check "Filter terms" only
  • set up filter to "Have, namespace, contains, plant_structure_development_stage"
  • check "always save properties"
  • add a comment "Filtered from plant_ontology_assert_basic.obo to have only plant structure development stage terms. Matches plant_ontology.obo version ####."
  • save and check by re-opening OboEdit
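The two filtered saves follow the same pattern, which can be sketched in a few lines. This is an illustration rather than a replacement for the OboEdit procedure: it assumes stanzas are separated by blank lines with the header first, matches the namespace exactly (the OboEdit filter uses "contains"), and the function name is hypothetical. It also drops replaced_by and consider lines that point at terms outside the kept branch, which is the manual step described above.

```python
def filter_by_namespace(text, namespace):
    """Keep the header, non-term stanzas, and [Term] stanzas in `namespace`."""
    blocks = [b for b in text.strip().split("\n\n") if b.strip()]
    header, stanzas = blocks[0], blocks[1:]

    def is_term(stanza):
        return stanza.splitlines()[0].strip() == "[Term]"

    def in_namespace(stanza):
        wanted = f"namespace: {namespace}"
        return any(line.strip() == wanted for line in stanza.splitlines())

    kept = [s for s in stanzas if not is_term(s) or in_namespace(s)]
    kept_ids = {
        line[4:].strip()
        for s in kept
        for line in s.splitlines()
        if line.startswith("id: ")
    }

    cleaned = []
    for stanza in kept:
        lines = []
        for line in stanza.splitlines():
            if line.startswith(("replaced_by:", "consider:")):
                target = line.split(":", 1)[1].split("!", 1)[0].strip()
                if target not in kept_ids:
                    continue  # link crosses into the other branch
            lines.append(line)
        cleaned.append("\n".join(lines))
    return "\n\n".join([header] + cleaned) + "\n"
```

Calling it once with "plant_anatomy" and once with "plant_structure_development_stage" on the text of plant_ontology_assert_basic.obo yields candidates for po_anatomy.obo and po_temporal.obo; the explanatory comment still has to be added to each file.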

Tab-delimited version files:

  • po_all.tbl (should be called plant_ontology.tbl; note that when this change is made, we will need to fix the link from the download page)
  • po_anatomy.tbl
  • po_temporal.tbl

These are generated from the MySql database directly and not from the obo files.

These contain the PO:id, term name, the definition, any synonyms (except the Japanese translations) and the aspect (in po_all.tbl)

Oct 2011: It would also be good to have a column listing the alt_ids.

Note: These files also include the obsoleted terms. This is indicated in the definition field of those terms.

Need to double check the numbers- #18, 1609 total terms, 131 obsoletes labeled, 1476 terms (stats page);

missing these 2

  • obsolete plant growth and developmental stage terms PO:0007532
  • obsolete growth and development terms PO:0007060

RW: These are old terms that never had definitions. We should still add OBSOLETE to the definitions. I will do this next time I'm editing.
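Double-checking the numbers can be done with a short script over the .tbl file. The sketch below assumes a column order of id, name, definition, synonyms, aspect (the order is a guess based on the description above) and that obsoleted terms carry the word OBSOLETE in the definition field; the function name is hypothetical.

```python
def count_terms(tbl_text):
    """Return (total, labeled_obsolete, unlabeled_ids) for a .tbl file."""
    total = 0
    labeled = 0
    unlabeled = []
    for line in tbl_text.splitlines():
        row = line.split("\t")
        if len(row) < 3 or not row[0].startswith("PO:"):
            continue  # skip header or malformed lines
        total += 1
        if "OBSOLETE" in row[2]:
            labeled += 1
        elif row[1].lower().startswith("obsolete"):
            # Obsolete by name but not flagged in the definition field.
            unlabeled.append(row[0])
    return total, labeled, unlabeled
```

Terms returned in the third value are the ones that would be missed by a count based on the definition field alone, such as the two listed above.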


Why do some terms have obsolete in the name and not others?

RW: Other than the two oddball terms listed above, any terms that have "obsolete" in their name are terms that have the same name as a live term. The "obsolete" was added because OE did not let you have two terms with the same primary name. I think it does now, but we still kept the practice, just so people wouldn't accidentally use the obsolete term that has the same name.

These files are located at TBL files on SVN trunk and the po_all.tbl file is copied over to the live tag.

The link from the PO download page goes to the po_all.tbl Live tag.

Note: Only the po_all.tbl file is available at the Live tag.

It is confusing to have these located in two different locations. This is a problem if changes are made in the file on live which are not reflected in the trunk version. So if any changes are required, they must be made in the file on Trunk and a new .tbl file generated.

OWL version

An OWL version of the ontology file is generated for each release, which is located in the SVN trunk folder at: OWL_Format.

This file is generated from the plant_ontology.obo file at the time of the release, and subsequently copied to Live Tag.

The file is created by running the command "obolib-obo2owl plant_ontology.obo -o plant_ontology.owl". It then needs to be hand-edited to include the version info. The following line goes right above the </owl:Ontology> line:

<owl:versionInfo xml:lang="en">version ##</owl:versionInfo>

where ## is the version number.
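The hand edit could be scripted so that it is not forgotten. This is a minimal sketch that works on the file as text and assumes the standard OWL annotation form, with the element spelled owl:versionInfo and the xml:lang attribute inside the opening tag; the function name is hypothetical.

```python
def add_version_info(owl_text, version):
    """Insert an owl:versionInfo line directly above </owl:Ontology>."""
    closing = "</owl:Ontology>"
    if closing not in owl_text:
        raise ValueError("no </owl:Ontology> line found")
    line = f'<owl:versionInfo xml:lang="en">version {version}</owl:versionInfo>\n'
    # Only the first closing tag is touched; an ontology file has just one.
    return owl_text.replace(closing, line + closing, 1)
```

The result should be opened once in an OWL-aware tool after the edit to confirm that the file still parses.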

OWL files available on trunk:

  • plant_ontology.owl (This corresponds to plant_ontology.obo from the live tag)

Older files- no longer being updated:

  • po_anatomy.owl
  • po_temporal.owl

Only the plant_ontology.owl file is moved over to the live tag.

-We need a 'Readme' file for this folder and some notes on how this is generated

External Reviews

Once we have completed the editing, the Ontology should be open for review on the Beta browser. This may occur with or without the full set of annotations, depending on what we want to test (see below for more details on Beta).

Inviting Specific External Reviewers

For some releases, we may ask specific experts and/or collaborators to take part in an External Review. This was done in Aug 2010 for the October 2010 release. The external experts will need at least a couple of weeks and up to one month.

To facilitate the review, the POC may need to make a presentation to them, such as the Plant_Ontology_Webinar-_May_2011_release PO-Physcomitrella presentation, to feature the new terms for non-vascular plants, in particular, Physcomitrella patens.


Add more details here of the external review process...

Beta Release Announcement

An announcement should be sent out on the mailing lists something like this:

"A beta version of the latest release of the PO is now available on our AmiGO Browser (http://beta.plantontology.org/amigo/go.cgi). Along with the Ontology itself the beta browser also is loaded with some test annotation files (not the full set of files at this point).

You can download the ontology files in OBO-Edit format from: http://palea.cgrb.oregonstate.edu/viewsvn/Poc/branches/beta/ontology/OBO_format/

There is a readme.txt file on the same page explaining the differences among the files available.

Further information on the upcoming release can be obtained from: Links to Release Pages: "Summary of Changes" and "New and Obsoleted Terms"

Please review the Plant Ontology and let us know asap about any important issues you observe."

Loading the Live Version

Once the review and internal checks and fixes are completed, the plant_ontology_assert.obo file and the whole database of annotations are loaded from the beta branch onto the beta browser.

Note that it is the "plant_ontology_assert.obo " file that should be loaded onto AmiGO browser on the live site.

Once the release is live on the Plant Ontology page, the ontology files should be copied from the beta branch to Live Tag.

This is a stable url that will not change, so it can be incorporated into users' scripts, etc.


In the October 2011 release, we introduced Japanese translations of the term names. Currently, JE is stripping out the Japanese synonyms before he loads the live version onto AmiGO. We had discussed whether or not to remove them from the live release and provide a separate file which included the translated terms, but that was rejected (meeting date: fill in date...). We have had complaints from some of our user groups (e.g., Gramene) that their browsers cannot display the Japanese characters. We should reconsider this approach.

Changes to the HTML Files for the PO web site:

Prepare "Release Notes" page

  • Each Release and any Interim Data Release should have a unique Release Notes webpage on the PO site detailing the current statistics of the release.
  • The old pages are archived Release Notes Archive so that we can keep the track of the changes.


Update PO front page and other pages

  • Post release announcements
  • Update any "Upcoming Events", move older event to the archive page


  • Update the "Filter Annotation Objects Counts" for new species, data types in release


This is a good time to incorporate any changes or fixes to the web pages:

List needed fixes here:

  • Documentation page: replace link to old dbxrefs file with a link to the current one on the svn.

Announcements

  • Prepare and post announcements for the PO front page, Jaiswal Lab Page, FB page
  • Send out announcements to the mailing lists

other places??


Updating the Plant Ontology files on other sites:

The two main sites where people go to download the PO are:

Also Ontobee, but that pulls the file from http://purl.obolibrary.org/obo/po.owl, so it should always be up to date.

Also featured at:

  • TAIR:

See this page for links to other sites that feature the PO: Links_to_sites_using_the_PO

Updating the "Filter Annotation Objects" lists on the AmiGO browser:

  • If annotations from new database groups or species are added, we should update the "Filter Annotation Objects" list on the browser

These lists should match what is in our database; as listed on the release notes page.

LC: new in V#16: Jaiswal Lab: Fragaria vesca (strawberry) and Gramene: Physcomitrella

In Jan 2012 16A release, we added a large set from Physcomitrella from Cosmoss. Mar 2012 release will include new ones from AgBase for cotton (Gossypium spp.) and new species for SGN.

We also need SGN germplasm which has 4503 annotations

  • It would be good to be able to filter by species

JE: This is handled by the freez_hash_misc_keys script. Make the changes in that file, run the script and copy the created file to the cgi-bin/plantontology directory.