Difference between revisions of "PO Release SOP Page"

From Plant Ontology Wiki

Revision as of 20:52, 20 August 2013

This page is a place to list all the steps we need to take for the database releases.

This page is under development: Let's not reinvent the wheel every 4 months.

Summary of Changes Wiki Pages


For example, see October_2011_Release_Page, which links to the "Summary of Changes" page (e.g., Summary of Changes to PO October 2011).

  • The summary should also list terms that have been merged, redefined, or renamed.
  • If possible, it is also good to list new synonyms for existing terms, especially if those synonyms are quite different from the original name.

For example, we would want to highlight that cone is a synonym of strobilus, but it is not that important to note that portion of epidermal tissue is a synonym for epidermis.

Preparing the Association files for the Release

Existing annotation files:

  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see the wiki page "Changes to the ontology").
  • You can use grep on the command line to search for the term IDs that have been obsoleted.

Note: Even if terms have been merged and an alt_id has been created, the affected lines will not load correctly. Manually fix the annotation file(s) by replacing the old ID with the ID that replaces it.

  • Inform the collaborating group about what needs to be fixed and/or fix those ourselves if necessary.
  • In some cases, the affected annotation lines will need to be pulled out of the file.
  • Any problems found in the association files should be fixed in the files in the trunk folder and then copied over to beta, replacing the existing file with the problems. (see below)

This accomplishes two things: it maintains the version history on trunk and makes sure the fixes are not lost.

Note: As the number of annotations and the size of the association files increase, this process will take longer. Version #16 required 46 hours to load, so it is a good idea to do this check before we start the test load.
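
The grep check described above can also be scripted. The following is a minimal Python sketch, not a script the POC actually uses. It assumes that PO IDs always have the form PO: followed by seven digits and that the mapping of retired IDs to replacements is assembled by hand from the "Changes to the ontology" page. The function names find_stale_ids and fix_line are hypothetical; the one mapping shown (seedling, PO:0008037, replaced by PO:0007131) is taken from the po_anatomy.obo instructions on this page.

```python
import re

# PO identifiers are assumed to look like "PO:" plus seven digits.
PO_ID = re.compile(r"PO:\d{7}")

# Retired ID -> replacement ID, or None if obsoleted without a replacement.
REPLACED = {"PO:0008037": "PO:0007131"}


def find_stale_ids(lines, replaced=REPLACED):
    """Yield (line_number, old_id, new_id) for every retired ID found."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("!"):  # skip comment/header lines
            continue
        for po_id in PO_ID.findall(line):
            if po_id in replaced:
                yield number, po_id, replaced[po_id]


def fix_line(line, replaced=REPLACED):
    """Swap retired IDs for their replacements; leave everything else alone."""
    def swap(match):
        new_id = replaced.get(match.group(0))
        # IDs without a replacement are left in place for manual review.
        return new_id if new_id else match.group(0)
    return PO_ID.sub(swap, line)
```

Lines whose retired ID has no replacement are reported by find_stale_ids but not changed by fix_line, since those are the lines that may need to be pulled out of the file by hand.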


TAIR

  • It is important to let TAIR know ahead of time that we are preparing to do a release and give them the list of changed or obsoleted terms. They will have to remap those to the new terms in their files. They like to have at least a week.
  • Usually we ask them to stop their automatic weekly commits to the SVN for a couple of weeks while we get everything fixed.

New Annotation files

  • Need to be mapped to PO terms
  • Use appropriate cut-offs for each data set
  • New DBxRefs need to be set up: see below
  • Create a page on the wiki Annotations for the data set to document it

Column 16 annotations (for future releases):

The IT team should run a script to check for needed column 16 annotations; see list at PO Suggestions for Col 16 and more info at: PO Annotation Extensions (column 16)

Fixing Links to Dbxrefs:

This was an issue that was discussed and worked on extensively for the version #16 release.

For more details of the discussions, see the POC_Meetings_Minutes from 9-27-11 through 10-18-11.

Our current version of AmiGO has a built-in set of dbxrefs, in a Perl file, and it is not reading the current GO file. In order to update these, this file has to be edited manually.

We decided on the POC_Conf._Call_9-27-11 to use the PO_DBXref.txt file on the SVN to store the updated dbxrefs stanzas. Then the dbxrefs Perl file can be modified based on that.

  • An older version of this file is also available through the documentation page on the PO website:

The link on the Documentation page should go to the current file on the SVN. This change has been implemented.

Note from LC: Do we really need to have this link on our website, since it is just for our internal use?

Icons for new relations

If new relations have been introduced and approved, then the following steps should be taken:

  • Icons need to be stored in the folder on the SVN: icons folder


PJ requested that we embed images, linked to external files, next to each relation, but this may not work because the link view images on the SVN does not appear to work. Also, one is generally required to upload an image before it can be embedded on a wiki page. Tried to set it up using html but the <img> tag did not work on the wiki.


There is a link to the relations page from the icons on the live browser.

Should share the icons with OBO Foundry, so they can be used consistently across ontologies.

JE: I added the images for this next to each relation section. The links go to the AmiGO icons, not to the SVN, as that doesn't work. I had to enable "allow external images" in MediaWiki. There is no need for the <img> tag; just put in the link to the images you want embedded. These do not seem to be working.

Preparing Ontology Files for Release:

Curators' names in definitions, etc.

  • Curators' names in definitions and synonyms should be spelled out in full (for example, GR:Pankaj_Jaiswal). These are not links unless the prefix is POC.
  • POC curators should be listed on the wiki page POC curators.
  • Any reference that begins with POC (e.g., POC:curators, POC:wood_curators, POC:Maria_Alejandra_Gandolfo) will link to the Plant_Ontology_curators page.
  • If the definition is worked out as a group, then the source should be POC:curators. This should link to a page where all the curators of the POC are listed.

Updating the translations

  • There is a translations folder on the SVN. It is not clear how this gets updated.
  • JE has a script to insert them. As of Dec 2012, the insert_translations script is inserting extra spaces after the Japanese name. JE fixed it manually. Note that this only temporarily fixes the OBO file; we still need to find out why it is happening.

This worked well. Then they were manually inserted into the OBO file. This took time.

Po-Refs Page

- Links to references that do not have a PubMed ID, ISBN number, or stable URL should go to the new PO references page

Links to PO:ids in term definitions

Starting in the April 2012 release we are putting the PO:ids in both the term definitions and in the comments. Currently, the links in the comment are working but not the ones in the definitions.

There is a stanza in the dbxrefs file which refers to these, but it needs to be updated in the AmiGO browsers.

Dev version

Between releases, editing is done on the developers version of the plant_ontology.obo file located at: OBO Format.

  • plant_ontology.obo

This is the "developers" version and is loaded onto the Dev Browser nightly, along with the SVN version #. No annotations are loaded on this browser.

During the preparation of the release, editing may continue on the dev file, without disrupting the version on the live browser or the beta browser (see below).

Once the initial Quality Checks (see above) are completed, the plant_ontology_assert.obo file can be created (see below).

The "plant_ontology_assert.obo" file can be loaded onto the Beta Browser, along with annotation files so we can check for any problems.

(Note that in the past, we moved everything to a new beta branch, prior to the release, but this practice was discontinued after the October 2011 release.)

This is a temporary location for the curators and reviewers to look over the new version of the ontology for the planned release.

  • Note: We should also put the revised PO front page on the Beta Browser for review, prior to publishing.


If a problem is found that needs to be fixed for the release:

If a problem is found in the beta version prior to the release, the fix should be made in the DEV version, and the process of generating the additional files and loading the beta browser should be repeated.

This ensures that the fixes are not lost from the editor's version in the process of the release. This is why it is important for the editors to do a thorough review and run QC checks prior to creating the "asserted file" and loading it on beta.

Once the editing and quality checks are completed for the release, the following additional versions of the Plant Ontology file are prepared.

This needs to be done prior to the release, as plant_ontology_assert.obo is the file loaded onto the AmiGO browser (see below for details).

Files created for the release, based on the 'final' version of plant_ontology.obo:

  • plant_ontology_assert.obo
    • plant_ontology_assert_basic.obo
      • po_anatomy.obo
      • po_temporal.obo


LC: Note: These files should not remain in the trunk folder with the dev version of the ontology once the release is out. They are no longer in sync with the editors' version of the file, and having them in two places will be confusing.

In hindsight, I think the additional files should only be generated once the review process is completed. If an ontology problem is found, these will have to be recreated (see below).

Details of the alternative files for the release

plant_ontology_assert.obo

This version is the same as plant_ontology.obo, with all nonredundant implied links asserted (added) by a reasoner.

  • Use the OBO-Release manager Oort, as the "assert implied links" panel in OBO-Edit does not work very well.

To create plant_ontology_assert.obo:

1. Create a new temp folder on your hard drive for the files and some subfolders created by Oort

2. Open the desktop GUI for Oort OBO-Release manager

3. Browse to the latest version of plant_ontology.obo as the input file

4. Specify the folder you created for the output- browse to it.

5. On the "Advanced" tab, uncheck all options.

6. Select only "OBO" as a Write Format

7. Choose either the HermiT or the Pellet reasoner (results should be the same)

8. Click 'run'

9. Rename the output file "po.obo" as "plant_ontology_assert.obo" and save in the appropriate directory for the SVN

10. Use SVN add, then SVN commit to upload to svn

  • Note that unlike OBO-Edit, Oort also asserts the relations that are specified in any intersection_of lines. For example, if leaf sinus is defined as intersection_of is_a sinus and intersection_of part_of leaf, Oort will assert "leaf sinus part_of leaf". "Assert implied relations" in OE will NOT do this, but the relation will show up in graphical view if the reasoner is on.

plant_ontology_assert_basic.obo

This version is the same as plant_ontology_assert.obo, but with only the is_a and part_of relations. This file is created using a filtered save in OBO-Edit (it may also be possible with a query in Oort, although I have no experience with that).
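
To illustrate what the filtered save is meant to achieve, here is a rough Python sketch. It is not a replacement for the OBO-Edit procedure described below. It relies on the fact that in OBO format is_a links are written on their own is_a: lines, while all other links are written as relationship: lines whose first word is the relation type. The function name filter_basic is hypothetical, and the sketch ignores intersection_of lines entirely.

```python
# Relation types to keep on "relationship:" lines. is_a links live on
# separate "is_a:" lines in OBO format, so they are never touched here.
KEEP = {"part_of"}


def filter_basic(obo_lines, keep=KEEP):
    """Return the OBO lines with unwanted relationship lines removed."""
    kept = []
    for line in obo_lines:
        if line.startswith("relationship:"):
            parts = line.split()
            # parts[1] is the relation type, e.g. "part_of".
            rel_type = parts[1] if len(parts) > 1 else None
            if rel_type not in keep:
                continue
        kept.append(line)
    return kept
```

Comparing the output of such a sketch with the file saved from OBO-Edit could serve as an extra sanity check, but the OBO-Edit file remains the one that is released.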

To create plant_ontology_assert_basic.obo in OBO-Edit:

1. Open plant_ontology_assert.obo from the Live tag on SVN

2. Select save as and use Advanced save

3. Create a new save path: Set the save path to .....Poc/tags/live/plant_ontology_assert_basic.obo

4. Check "Filter links" only

5. Set up two filters with "Matches any":

  • Find links where: "Type" "have" a: "Any text field" that: "contains", the value: "is_a"
  • Find links where: "Type" "have" a: "Any text field" that: "contains", the value: "part_of"

6. Select: "Greedy root selection algorithm" and "Don't write current ID rules" (but it doesn't matter for the substantive parts of the file.)

7. Add a comment "Filtered from plant_ontology_assert.obo to have only is_a and part_of relations. Matches plant_ontology.obo Release version ####."

8. Save and re-open the file in OBO-Edit. You will see a warning that a number of terms have exactly one intersection_of relation. These are the cross products that were set up using participates_in, has_participant, or develops_from.

Note: In version #20, there were only 7 of these:

  • flower development stage (PO:0007615): collective plant organ structure development stage (PO:0025338) has_participant flower (PO:0009046)
  • gametophyte meristematic apical cell (PO:0030014): meristematic apical cell (PO:0030007) participates_in gametophyte development stage (PO:0028003)
  • plant embryo stage (PO:0007631): sporophyte development stage (PO:0028002) has_participant plant embryo (PO:0009009)
  • plant spore stage (PO:0025375): gametophyte development stage (PO:0028003) has_participant plant spore (PO:0025017)
  • primary vascular tissue (PO:0025408): portion of vascular tissue (PO:0009015) develops_from procambium (PO:0025275)
  • secondary vascular tissue (PO:0025409): portion of vascular tissue (PO:0009015) develops_from vascular cambium (PO:0005598)
  • sporophyte meristematic apical cell (PO:0030015): meristematic apical cell (PO:0030007) participates_in sporophyte development stage (PO:0028002)

9. Remove those links in OBO-Edit (they can also be removed manually in the text file). They aren't needed, since all implied links have already been asserted.

  • Set up a filter with "Matches any": Find links where: "Self" "don't have" a: "Is intersection" that: "contains", the value: leave blank

see: Link Filtering

10. Save and re-open the file. The error messages should be gone.
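
The warning about terms with exactly one intersection_of line can also be checked outside OBO-Edit. The following is a minimal Python sketch under stated assumptions: the file is well-formed OBO with one tag per line and stanzas introduced by headers such as [Term] and [Typedef]. The function name single_intersection_terms is hypothetical.

```python
def single_intersection_terms(obo_lines):
    """Return IDs of [Term] stanzas that have exactly one intersection_of line."""
    flagged = []
    stanza_type, term_id, count = None, None, 0
    # The extra "[End]" header is a sentinel that flushes the last stanza.
    for raw in list(obo_lines) + ["[End]"]:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            if stanza_type == "[Term]" and term_id and count == 1:
                flagged.append(term_id)
            stanza_type, term_id, count = line, None, 0
        elif line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            count += 1
    return flagged
```

An empty list returned for the saved file would correspond to the error messages being gone when the file is re-opened.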

Separate Aspect files:

See discussion on POC_Conf._Call_4-17-12

We will keep generating these and posting them on Bioportal through the next release #18. After that people will have to come to the SVN to obtain them.

  • Alert users that we will stop supplying these files after release 18. Bioportal should only serve plant_ontology.obo (or does it serve plant_ontology_assert.obo?)
  • Will these still be on Bioportal as "legacy data"?
  • What about the GRO ontologies? These are also causing the same issue.
  • Who else is still using these files?
  • Need to check with Gramene, TAIR, and SGN to see if they are still using those files.

TAIR is not. Gramene: inquiry sent 7-13-12; waiting for a reply.

These files were originally created after the change in January 2011, as our collaborators at TAIR still needed the separate files of the ontology. Since they preferred the 'basic' form of the ontology file, these were derived from the plant_ontology_assert_basic.obo file.

Thus, these really should be called po_anatomy_assert_basic.obo and po_temporal_assert_basic.obo. If this change is made, however, we must fix the links on the download page and check whether any users are accessing the files under the current names.

Other users, such as SGN, would prefer to have all the relations in the files, except for the ones that go between the two branches (e.g., participates_in).

Separate aspect files: Alert users that after release #18, we will not be offering these on Bioportal any longer. See notes of 4-17-12 meeting


po_anatomy.obo

This file is created by filtering plant_ontology_assert_basic.obo to contain only terms from the plant anatomical entity branch. The instructions below describe how to do this in OboEdit, but in theory it could be done using Oort as well.

To create po_anatomy.obo in OboEdit:

  • Open plant_ontology_assert_basic.obo
  • Manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).
  • Remove:

gametophyte (PO:0009004); replaced_by: PO:0028003 gametophyte development stage

sporophyte (PO:0009003); replaced_by: PO:0028002 sporophyte development stage

seedling (PO:0008037); replaced_by: PO:0007131 seedling development stage

  • After removing these links, save the file as some dummy name, so that you can use it to create po_temporal.obo.
  • Open advanced save dialog and set the save path to .../po_anatomy.obo
  • Check "Filter terms" only
  • Set up filter to "Have, namespace, contains, plant_anatomy"
  • Check "always save properties"
  • Add a comment "Filtered from plant_ontology_assert_basic.obo to have only plant anatomical entity terms. Matches plant_ontology.obo version ####."
  • Save and check by re-opening OboEdit

po_temporal.obo

This file is created by filtering plant_ontology_assert_basic.obo to contain only terms from the plant structure development stage branch. The instructions below describe how to do this in OboEdit, but in theory it could be done using Oort as well.

To create po_temporal.obo in OboEdit:

  • open plant_ontology_assert_basic.obo
  • manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).
  • open advanced save dialog
  • set the save path to .../po_temporal.obo
  • check "Filter terms" only
  • set up filter to "Have, namespace, contains, plant_structure_development_stage"
  • check "always save properties"
  • add a comment "Filtered from plant_ontology_assert_basic.obo to have only plant structure development stage terms. Matches plant_ontology.obo version ####."
  • save and check by re-opening OboEdit
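
The namespace filter used for both aspect files can be sketched in Python as follows. This is only an illustration of the filtering idea, not the OboEdit procedure itself. It assumes that stanzas are separated by blank lines and that every term carries an explicit namespace: line; terms that rely on a default namespace in the file header would be dropped. It also matches the namespace exactly, whereas the OboEdit filter uses "contains". The function name filter_namespace is hypothetical.

```python
import re


def filter_namespace(obo_text, namespace):
    """Keep the header, non-Term stanzas, and Term stanzas in `namespace`."""
    kept = []
    # Stanzas are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", obo_text.strip()):
        lines = block.splitlines()
        if lines and lines[0].strip() == "[Term]":
            found = [
                line.split(":", 1)[1].strip()
                for line in lines
                if line.startswith("namespace:")
            ]
            if namespace not in found:
                continue
        kept.append(block)
    return "\n\n".join(kept) + "\n"
```

Calling it once with plant_anatomy and once with plant_structure_development_stage would give the two term sets. The manual removal of the cross-branch replaced_by and consider links still has to happen first, as in the steps above.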

Tab-delimited version files:

These files contain the PO:ID, term name, the definition, any synonyms and the aspect (in po_all.tbl)

Oct 2011: It would also be good to have a column listing the alt_ids.

Note: These files also include the obsoleted terms. This is indicated in the definition field of those terms and in some cases, in the term name. Any terms that have "obsolete" in their name are terms that have the same name as a live term.

The "obsolete" was added because older versions of OBO-Edit did not let you have two terms with the same primary name. I think it does now, but we still kept the practice, just so people wouldn't accidentally use the obsolete term that has the same name.

Note: These are generated directly from the MySQL database and not from the OBO files.

po_all.tbl

This should be called plant_ontology.tbl. Note: When this change is made, we will need to fix the link from the download page.

Also need to change it in the script

po_anatomy.tbl and po_temporal.tbl

po_synonyms.tbl

what is this file?

Location of table files

Note: Only the po_all.tbl file is available at the Live tag.

It is confusing to have these in two different locations. This becomes a problem if changes are made to the file on Live that are not reflected in the Trunk version. So if any changes are required, they must be made in the file on Trunk and a new .tbl file generated.

OWL version

An OWL version of the ontology file is generated for each release, which is located in the SVN trunk folder at: OWL_Format.

This file is generated from the plant_ontology.obo file at the time of the release, and subsequently copied to Live Tag.

Creation is done by executing the command "obolib-obo2owl plant_ontology.obo -o plant_ontology.owl". This file then needs to be hand edited to include the version info. The following line goes right above the </owl:Ontology> line:

<owl:versionInfo xml:lang="en">version ##</owl:versionInfo>

where ## is the version number.
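
The hand edit could be scripted along the following lines. This is a minimal Python sketch, not the release script. It assumes the generated file contains exactly one explicit </owl:Ontology> closing tag, and it writes the element as owl:versionInfo with xml:lang as an attribute, which is the standard OWL spelling of that annotation property. The function name add_version_info is hypothetical.

```python
def add_version_info(owl_text, version):
    """Insert an owl:versionInfo element right above </owl:Ontology>."""
    closing = "</owl:Ontology>"
    if closing not in owl_text:
        # e.g. a self-closing <owl:Ontology .../> element; edit by hand.
        raise ValueError("no </owl:Ontology> line found")
    tag = f'<owl:versionInfo xml:lang="en">version {version}</owl:versionInfo>'
    head, tail = owl_text.split(closing, 1)
    return head + tag + "\n" + closing + tail
```

The text would be read from plant_ontology.owl and written back after the call; checking the result in a text editor before committing is still a good idea.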

OWL files available on trunk:

  • plant_ontology.owl (This corresponds to plant_ontology.obo from the live tag)

Older files- no longer being updated:

  • po_anatomy.owl
  • po_temporal.owl

Only the plant_ontology.owl file is moved over to the live tag.

  • We need a 'Readme' file for this folder and some notes on how this is generated.
  • A new script creates an OWL file from dev every night; we need to add some info about this.

PO Anatomy Glossary

Plant Anatomy Glossary

The Plant Anatomy glossary is linked to the live database and updates automatically, but it is a good idea to check it after the database is switched.

Quality Checks and Reviews:

Before each release, ontology editors and curators should run some QC checks. In fact, these should be done on a regular basis in between releases, but it is crucial to do them immediately before a release.

  • Run the reasoner to remove any redundant links.
  • Do a search for extra space and odd characters in terms, definitions and dbxrefs (see above).
  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see above)
  • Update the translation files
  • Add any new DbxRefs and fix any broken links
  • Update any annotation files
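
The search for extra spaces and odd characters in the list above could be scripted roughly as follows. This is a Python sketch under assumptions of our own: "odd characters" is taken to mean control characters and the Unicode replacement character, and "extra space" is taken to mean trailing whitespace, double spaces, or a space just inside a quoted string (the symptom reported for the translation script). The function name find_text_problems is hypothetical.

```python
import re

# Double-quoted strings, allowing backslash-escaped characters inside.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Control characters (except tab) and the Unicode replacement character.
CONTROL = re.compile("[\x00-\x08\x0b-\x1f\x7f\ufffd]")


def find_text_problems(obo_lines):
    """Yield (line_number, problem) for suspicious whitespace or characters."""
    for number, raw in enumerate(obo_lines, start=1):
        line = raw.rstrip("\r\n")
        if line != line.rstrip():
            yield number, "trailing whitespace"
        if "  " in line.strip():
            yield number, "double space"
        if any(text != text.strip() for text in QUOTED.findall(line)):
            yield number, "space just inside quotes"
        if CONTROL.search(line):
            yield number, "control or replacement character"
```

Legitimate non-ASCII text, such as the Japanese term names, is deliberately not flagged.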

Located_in: see embryo sac (PO:0025074)

Internal Reviews

  • Once we have completed the editing, the Ontology should be open for review on the Beta browser
  • The whole POC team should take a little time to do an internal review. Look for any inconsistencies, typos, broken links, etc.
  • This may occur with or without the full set of annotations, depending on what we want to test (see below for more details on Beta).

Inviting Specific External Reviewers

For some releases, we may ask specific experts and/or collaborators to take part in an External Review. This was done in Aug 2010 for the October 2010 release. The external experts will need at least a couple of weeks and up to one month.

To facilitate the review, the POC may need to make a presentation to them, such as the Plant_Ontology_Webinar-_May_2011_release PO-Physcomitrella presentation, to feature the new terms for non-vascular plants, in particular, Physcomitrella patens.


Add more details here of the external review process...

Beta Release Announcement

An announcement along the following lines should be sent out on the mailing lists:

"A beta version of the latest release of the PO is now available on our AmiGO Browser (http://beta.plantontology.org/amigo/go.cgi). Along with the ontology itself, the beta browser is also loaded with some test annotation files (not the full set of files at this point).

You can download the ontology files in OBO-Edit format from: http://palea.cgrb.oregonstate.edu/viewsvn/Poc/branches/beta/ontology/OBO_format/

There is a readme.txt file on the same page explaining the differences among the available files.

Further information on the upcoming release can be obtained from: Links to Release Pages: "Summary of Changes" and "New and Obsoleted Terms"

Please review the Plant Ontology and let us know asap about any important issues you observe."

Loading the Live Version

Once the review and internal checks and fixes are completed, the plant_ontology_assert.obo file and the whole database of annotations are loaded from the beta branch onto the beta browser.

Note that it is the "plant_ontology_assert.obo" file that should be loaded onto the AmiGO browser on the live site.

Once the release is live on the Plant Ontology page, the ontology files should be copied from the beta branch to Live Tag.

This is a stable URL that will not change, so it can be incorporated into users' scripts, etc.


In the October 2011 release, we introduced Japanese translations of the term names. Currently, JE is stripping out the Japanese synonyms before he loads the live version onto AmiGO. We had discussed whether or not to remove them from the live release and provide a separate file that included the translated terms, but that was rejected (meeting date: fill in date...). We have had complaints from some of our user groups (e.g., Gramene) that their browsers cannot display the Japanese characters. We should reconsider this approach.

Changes to the HTML Files for the PO web site:

Prepare "Release Notes" page

  • Each Release and any Interim Data Release should have a unique Release Notes webpage on the PO site detailing the current statistics of the release.
  • The old pages are archived at Release Notes Archive so that we can keep track of the changes.


Update PO front page and other pages

  • Post release announcements
  • Update any "Upcoming Events" and move older events to the archive page


  • Update the "Filter Annotation Objects Counts" for new species, data types in release


This is a good time to incorporate any changes or fixes to the web pages.

List needed fixes here:

  • Documentation page: replace link to old dbxrefs file with a link to the current one on the svn.

Announcements

  • Prepare and post announcements for the PO front page, Jaiswal Lab Page, FB page
  • Send out announcements to the mailing lists

other places??


Updating the Plant Ontology files on other sites:

The two main sites where people go to download the PO are:

Also Ontobee, but that pulls the file from http://purl.obolibrary.org/obo/po.owl, so it should always be up to date.

Also featured at:

  • TAIR:

See this page for links to other sites that feature the PO: Links_to_sites_using_the_PO

Updating the "Filter Annotation Objects" lists on the AmiGO browser:

  • If annotations from new database groups or species are added, we should update the "Filter Annotation Objects" list on the browser

These lists should match what is in our database, as listed on the release notes page.

LC: new in V#16: Jaiswal Lab: Fragaria vesca (strawberry) and Gramene: Physcomitrella

In the Jan 2012 16A release, we added a large set for Physcomitrella from Cosmoss. The Mar 2012 release will include new ones from AgBase for cotton (Gossypium spp.) and new species for SGN.

We also need the SGN germplasm, which has 4503 annotations.

  • It would be good to be able to filter by species

JE: This is handled by the freez_hash_misc_keys script. Make the changes in that file, run the script and copy the created file to the cgi-bin/plantontology directory.