Difference between revisions of "PO Release SOP Page"

From Plant Ontology Wiki

Revision as of 22:40, 9 October 2015

This page is a place to list all the steps we need to take for the database releases.

This page is under development: Let's not reinvent the wheel every 4 months.

Summary of Changes Wiki Pages


For example, see October_2011_Release_Page, which has links to the "Summary of Changes" page (e.g., Summary of Changes to PO October 2011).

  • The summary should also list terms that have been merged, renamed, or given changed definitions.
  • If possible, it is also good to list new synonyms for existing terms, especially if those synonyms are quite different from the original name.

For example, we would want to highlight that cone is a synonym of strobilus, but it is less important to note that portion of epidermal tissue is a synonym for epidermis.

Preparing the Association files for the Release

Existing annotation files:

  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see wiki page "Changes to the ontology").
  • You can use grep on the command line to search for the term ids that have been obsoleted.

Note: Even if terms have been merged and an alt_id has been created, annotations to the old ID will not load correctly. Manually fix the annotation file(s) by replacing the old ID with the ID that replaces it.

  • Inform the collaborating group about what needs to be fixed and/or fix those ourselves if necessary.
  • In some cases, the affected annotation lines will need to be pulled out of the file.
  • Any problems found in the association files should be fixed in the files in po-associations and then copied over to beta, replacing the existing file that has the problems (see below).

This accomplishes two things: it maintains the version history on SVN and makes sure the fixes are not lost.
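The check for obsolete or alternate IDs can be sketched in Python. This is only a minimal illustration, not the script the POC uses. The function names are hypothetical, and it assumes the ontology is available as OBO text with stanzas separated by blank lines, in which retired terms carry `is_obsolete: true` and merged terms carry `alt_id:` lines.

```python
import re

# PO identifiers have the form PO: followed by seven digits.
PO_ID = re.compile(r"PO:\d{7}")


def collect_stale_ids(obo_text):
    """Return the PO IDs that are obsolete or listed as an alt_id."""
    stale = set()
    for stanza in obo_text.strip().split("\n\n"):
        lines = [line.strip() for line in stanza.splitlines()]
        if not lines or lines[0] != "[Term]":
            continue
        for line in lines:
            if line.startswith("alt_id: "):
                stale.update(PO_ID.findall(line))
        if "is_obsolete: true" in lines:
            id_line = next((l for l in lines if l.startswith("id: ")), "")
            stale.update(PO_ID.findall(id_line))
    return stale


def find_stale_lines(association_lines, stale_ids):
    """Return (line_number, ids) for every line that cites a stale ID."""
    hits = []
    for number, line in enumerate(association_lines, start=1):
        found = sorted(set(PO_ID.findall(line)) & stale_ids)
        if found:
            hits.append((number, found))
    return hits


def replace_ids(line, replacements):
    """Swap each old ID in the line for its replacement, if one is given."""
    return PO_ID.sub(lambda m: replacements.get(m.group(0), m.group(0)), line)
```

The search matches an ID anywhere on the line rather than in a fixed column, so it makes no assumption about the layout of the association files. Lines whose term has no replacement still have to be reviewed by hand.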

Note: As the number of annotations and the size of the association files increase, this process will take longer. Version #16 required 46 hours to load, so it is a good idea to do this check before we start the test load.


Note: See also the checks listed here: Main_Page#Preparation_of_Ontology_Annotation_Files

TAIR

  • It is important to let TAIR know ahead of time that we are preparing to do a release and give them the list of changed or obsoleted terms. They will have to remap those to the new terms in their files. They like to have at least a week.
  • Note that they no longer do their automatic weekly commits to the SVN, so this may not be an issue anymore.

New Annotation files

  • Need to be mapped to PO terms
  • Use appropriate cut-offs for each data set
  • New DBxRefs need to be set up: see below
  • Create a page on the wiki Category:Annotations for the data set to document it

po_associations.zip file

This should be updated with the current version of the association files at each release

There is a link to this from the Download page

Should have a warning that the folder is big: 377.1 MB for 47 items for release #19, zipped file is ~30 MB

Column 16 annotations (for future releases):

The IT team should run a script to check for needed column 16 annotations; see list at PO Suggestions for Col 16 and more info at: PO Annotation Extensions (column 16)

Fixing Links to Dbxrefs:

This was an issue that was discussed and worked on a lot for the version #16 release.

For more details of the discussions see the minutes from POC_Meetings_Minutes on: 9-27-11 through 10-18-11.

Our current version of AmiGO has a built-in set of dbxrefs, in a Perl file, and it is not reading the current GO file. In order to update these, this file has to be edited manually.

On the POC_Conf._Call_9-27-11, we decided to use the PO_DBXref.txt file, now stored in the GitHub Planteome/common-files-for-ref-ontologies repository, to store the updated dbxrefs stanzas. The dbxrefs Perl file can then be modified based on that.

Links to PO:ids in term definitions

Starting in the April 2012 release we are putting the PO:ids in both the term definitions and in the comments. Currently, the links in the comment are working but not the ones in the definitions.

There is a stanza in the dbxrefs file which refers to these, but it needs to be updated in the AmiGO browsers.

Icons for new relations

If new relations have been introduced and approved, then the following steps should be taken:

  • Details of the new relation should be listed on the wiki page: Relations_in_the_Plant_Ontology. There is a link to the GitHub repository folder for the icons.

PJ requested that we embed images, linked to external files, next to each relation, but this may not work because the link view images on the SVN does not appear to work. Also, one is generally required to upload an image before it can be embedded on a wiki page. Tried to set it up using html but the <img> tag did not work on the wiki.


There is a link to relations page from icons on live browser.

Should share the icons with OBO Foundry, so they can be used consistently across ontologies.

JE: I added the images for this next to each relation section. Links go to the AmiGO icons, not to the SVN, as that doesn't work. Had to enable "allow external images" in MediaWiki. No need for the <img> tag; just put in the link to the images you want embedded. These do not seem to be working.


Note: As of Aug. 2015 the icon file is located in the Planteome GitHub Repository/common files

Preparing Ontology Files for Release:

Curators' names in definitions, etc.

  • Curators' names in definitions and synonyms should be spelled out in full, for example: GR:Pankaj_Jaiswal. These are not links unless they are POC.
  • POC curators should be listed on the wiki page POC curators
  • Any reference that begins with POC (e.g., POC:curators, POC:wood_curators, POC:Maria_Alejandra_Gandolfo), will link to the Plant_Ontology_curators page.
  • If the definition is worked out as a group, then the source should be POC:curators. This should link to a page where all the curators of the POC are listed.

Updating the translations

  • JE has a script to insert them. As of Dec 2012, the insert_translations script is inserting extra spaces after the Japanese name. JE fixed it manually. Note that this only temporarily fixes the obo file. We still need to find out why it is happening.

This worked well. Then they were manually inserted into the OBO file. This took time.

Po-Refs Page

- Links to references that do not have a PubMed ID, ISBN number, or stable URL should go to the new PO references page

Dev version

Between releases, editing is done on the developers version of the plant-ontology.obo file located at: plant-ontology@GitHub.

  • plant-ontology.obo

This is the "developers" version and is loaded onto the PO-Dev Browser nightly, along with the GitHub version #. No annotations are loaded on this browser.

During the preparation of the release, editing may continue on the dev file, without disrupting the version on the live browser or the beta browser (see below).

Once the initial Quality Checks (see above) are completed, the plant-ontology-assert.obo file can be created (see below).

The "plant-ontology-assert.obo" file can be loaded onto the Beta Browser, along with annotation files so we can check for any problems.

  • Note: We should also put the revised PO front page on the Beta Browser for review, prior to publishing.

If a problem is found that needs to be fixed for the release:

In the event that a problem is found in the beta version prior to the release, the fix should be made in the DEV version and the process of generating the additional files and loading the beta browser should be repeated anew.

This ensures that the fixes are not lost from the editor's version in the process of the release. This is why it is important for the editors to do a thorough review and run QC checks prior to creating the "asserted file" and loading it on beta.

Once the editing and quality checks are completed for the release, the following additional versions of the Plant Ontology file are prepared.

This needs to be done prior to the release, as plant-ontology-assert.obo is the file loaded onto the AmiGO browser (see below for details).

Files created for release based on the 'final' version of plant-ontology.obo

  • plant-ontology-assert.obo
  • plant-ontology-anatomy.obo
  • plant-ontology-temporal.obo


Note: These additional files should only be generated once the review process is completed. If an ontology problem is found, these will have to be recreated (see below).

Details of the alternative files for the release

plant-ontology-assert.obo

This version is the same as plant-ontology.obo with all nonredundant implied links asserted (added) by a reasoner.

  • Use the OBO-Release manager Oort, as the "assert implied links" panel in OBO-Edit does not work very well. Go here for more information about configuring Oort

To create plant-ontology-assert.obo:

1. Create a new temp folder on your hard drive for the files and some subfolders created by Oort

2. Open the desktop GUI for Oort OBO-Release manager

3. Browse to the latest version of plant-ontology.obo as the input file

4. Browse to the folder you created and specify it as the output folder.

5. On the "Advanced" tab, uncheck all options.

6. Select only "OBO" as a Write Format

7. Choose either the HermiT or the Pellet reasoner (results should be the same)

8. Click 'run'

9. Rename the output file "po.obo" to "plant-ontology-assert.obo", and save it in the appropriate repository on GitHub

10. Use git add, git commit, and git push to upload to GitHub. As of Aug. 2015 Git and Github are being used for version control.
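Steps 9 and 10 could be scripted along these lines. This is a sketch under stated assumptions: the function name is hypothetical, it assumes Oort writes po.obo at the top level of the output folder, and it only returns the git commands instead of running them, so they can be reviewed first.

```python
import shutil
from pathlib import Path


def stage_assert_file(oort_output_dir, repo_dir, version):
    """Copy Oort's po.obo into the repository under its release name.

    Returns the git commands to run next; nothing is executed here.
    """
    source = Path(oort_output_dir) / "po.obo"
    target = Path(repo_dir) / "plant-ontology-assert.obo"
    shutil.copyfile(source, target)
    # The commit message wording is only an example.
    message = f"Add plant-ontology-assert.obo for release {version}"
    return [
        ["git", "add", target.name],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]
```

Each returned command can be passed to `subprocess.run(command, cwd=repo_dir, check=True)` once the copied file has been checked.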

  • Note that unlike OBO-Edit, Oort also asserts the relations that are specified in any intersection_of lines. For example, if leaf sinus is defined as intersection_of is_a sinus and intersection_of part_of leaf, Oort will assert "leaf sinus part_of leaf".
  • "Assert implied relations" in OBO-Edit will NOT do this, but the relation will show up in graphical view if the reasoner is on.
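To make the note above concrete, the following Python sketch shows which links follow directly from a term's intersection_of lines. It is a purely syntactic illustration with a hypothetical function name, not a reasoner and not what Oort runs internally. The sample lines in the usage note use primary vascular tissue (PO:0025408), whose cross product appears in the list of version #20 terms on this page.

```python
def links_from_intersections(stanza_lines):
    """Derive the links stated by a term's intersection_of lines.

    A bare 'intersection_of: X' yields ('is_a', X); a line of the form
    'intersection_of: REL X' yields (REL, X). Trailing '! name' comments
    are dropped.
    """
    links = []
    for line in stanza_lines:
        if not line.startswith("intersection_of:"):
            continue
        value = line[len("intersection_of:"):].split("!")[0].split()
        if len(value) == 1:
            links.append(("is_a", value[0]))
        elif len(value) == 2:
            links.append((value[0], value[1]))
    return links
```

Given the lines `intersection_of: PO:0009015 ! portion of vascular tissue` and `intersection_of: develops_from PO:0025275 ! procambium`, the function returns an is_a link to PO:0009015 and a develops_from link to PO:0025275.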

plant-ontology-assert-basic.obo

This version is the same as plant-ontology-assert.obo, but with only the is_a and part_of relations. This file is created using a filtered save in OBO-Edit.

To create plant-ontology-assert-basic.obo in OBO-Edit:

1. Open plant-ontology-assert.obo from the Planteome GitHub po-release-files folder

2. Select 'save as' and use 'Advanced Save'

3. Create a new save path: Set the save path to .....plant-ontology/po-release-files/plant-ontology-assert-basic.obo

4. Check "Filter links" only

5. Set up 2 filters with "Matches any":

  • Find links where: "Type" "have" a: "Any text field" that: "contains", the value: "is_a"
  • Find links where: "Type" "have" a: "Any text field" that: "contains", the value: "part_of"

6. Select: "Greedy root selection algorithm" and "Don't write current ID rules" (but it doesn't matter for the substantive parts of the file.)

7. Add a comment "Filtered from plant-ontology-assert.obo to have only is_a and part_of relations. Matches plant-ontology.obo Release version ####."

8. Save and re-open the file in OBO-Edit. You will see a warning that a number of terms have exactly one intersection_of relation. These are the cross products that were set up using participates_in, has_participant, or develops_from.

Note: In version #20, there were only 7 of these:

  • flower development stage (PO:0007615): collective plant organ structure development stage (PO:0025338) has_participant flower (PO:0009046)
  • gametophyte meristematic apical cell (PO:0030014): meristematic apical cell (PO:0030007) participates_in gametophyte development stage (PO:0028003)
  • plant embryo stage (PO:0007631): sporophyte development stage (PO:0028002) has_participant plant embryo (PO:0009009)
  • plant spore stage (PO:0025375): gametophyte development stage (PO:0028003) has_participant plant spore (PO:0025017)
  • primary vascular tissue (PO:0025408): portion of vascular tissue (PO:0009015) develops_from procambium (PO:0025275)
  • secondary vascular tissue (PO:0025409): portion of vascular tissue (PO:0009015) develops_from vascular cambium (PO:0005598)
  • sporophyte meristematic apical cell (PO:0030015): meristematic apical cell (PO:0030007) participates_in sporophyte development stage (PO:0028002)


Note: In version #21, there were only 4 of these:

  • gametophyte meristematic apical cell (PO:0030014): intersection_of: PO:0030007 ! meristematic apical cell
  • plant spore stage (PO:0025375): intersection_of: PO:0028003 ! gametophyte development stage
  • secondary vascular tissue (PO:0025409): intersection_of: PO:0009015 ! portion of vascular tissue
  • sporophyte meristematic apical cell (PO:0030015): intersection_of: PO:0030007 ! meristematic apical cell


9. Remove those links, either in OBO-Edit or manually in the text file. They aren't needed, since all implied links have already been asserted.

  • In OBO-Edit: Set up a filter with "Matches any": Find links where: "Self" "don't have" a: "Is intersection" that: "contains", the value: leave blank

see: Link Filtering

10. Save and re-open the file. The error messages should be gone.
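For comparison, the effect of the filtered save can be approximated on the text file with a few lines of Python. This is a sketch with a hypothetical function name, not a replacement for the OBO-Edit procedure: it keeps is_a lines and part_of relationship lines, drops every other relationship line and all intersection_of lines, and leaves everything else (including the header and any [Typedef] stanzas) untouched.

```python
KEEP_RELATIONS = {"part_of"}


def to_basic(obo_text):
    """Keep is_a and part_of links; drop other relationships and intersection_of."""
    kept = []
    for line in obo_text.splitlines():
        if line.startswith("intersection_of:"):
            continue
        if line.startswith("relationship:"):
            parts = line.split()
            # parts[1] is the relation name, e.g. part_of or develops_from.
            if len(parts) < 3 or parts[1] not in KEEP_RELATIONS:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The header comment from step 7 still has to be added separately, and the result should be opened in OBO-Edit to confirm that it loads without warnings.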

Separate Aspect files:

See discussion on POC_Conf._Call_4-17-12

We will keep generating these and posting them on Bioportal through the release #18. After that people will have to come to GitHub to obtain them.

  • Note: Bioportal should only serve the current release version: plant-ontology-assert.obo
  • These will still be on Bioportal as "legacy data" (not sure what this means)

Separate aspect files: Alerted users that after release #18, we will not be offering these on Bioportal any longer. See notes of 4-17-12 meeting

These files were originally created after the change in January 2011, as our collaborators at TAIR still needed the separate files of the ontology. Since they preferred the 'basic' form of the ontology file, these were derived from the plant-ontology-assert-basic.obo file.

Thus these really should be called po_anatomy_assert_basic.obo and po_temporal_assert_basic.obo, but if this change is made, we must fix the links on the download page and check whether any users are accessing the files under the old names.

  • Who else is still using these files?

Check with Gramene and TAIR and SGN to see if they are still using those files

8-20-13

  • TAIR is not using these any more, they are moving to use the OWL file.

Other users, such as SGN, would prefer to have all the relations in the files, except for the ones that go between the two branches (e.g., participates_in).

  • SGN: waiting for reply
  • Gramene is linking to: plant_ontology_assert.obo on live tag
    • Note: Aug 2013 release #20: These appear to have been automatically uploaded to Bioportal, I received a notice, but I did not submit the files there. PURLS are not working. Send message to Bioportal support.


po-anatomy-assert-basic.obo

  • This file is created by filtering plant-ontology-assert-basic.obo to contain only terms from the plant anatomical entity branch. The instructions below describe how to do this in OBO-Edit, but in theory it could be done using Oort as well.

To create po-anatomy-assert-basic.obo in OBO-Edit:

1. Open plant-ontology-assert-basic.obo and save the file as "plant-ontology-assert-basic-TEMP.obo" or some dummy name

2. Manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).

  • gametophyte (PO:0009004): replaced_by: PO:0028003 gametophyte development stage
  • sporophyte (PO:0009003): replaced_by: PO:0028002 sporophyte development stage
  • seedling (PO:0008037): replaced_by: PO:0007131 seedling development stage

3. Save it again

4. Open advanced save dialog and set the save path to .../po-anatomy-assert-basic.obo

5. Check "Filter terms" only

6. Set up filter to "Have, namespace, contains, plant_anatomy"

7. Check "always save properties"

8. Add a comment "Filtered from plant-ontology-assert-basic.obo to have only plant anatomical entity terms. Matches plant-ontology.obo version ####."

For example: "Filtered from plant-ontology-assert-basic.obo to have only plant anatomical entity terms for Plant Ontology Release version 21."

9. Save and check by re-opening in OBO-Edit
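The namespace filter in steps 5 to 8 can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the first block of the file is the header, that stanzas are separated by blank lines, and that every term carries an explicit `namespace:` line. Non-term stanzas such as [Typedef] are kept, which is roughly what "always save properties" does in OBO-Edit.

```python
def split_stanzas(obo_text):
    """Split OBO text into blocks separated by blank lines."""
    return [block for block in obo_text.strip().split("\n\n") if block.strip()]


def filter_by_namespace(obo_text, namespace, comment):
    """Keep the header, non-term stanzas, and terms in the given namespace."""
    header, *stanzas = split_stanzas(obo_text)
    # The comment is recorded as a 'remark:' line at the end of the header.
    kept = [header + "\nremark: " + comment]
    for stanza in stanzas:
        lines = stanza.splitlines()
        if lines[0] != "[Term]" or f"namespace: {namespace}" in lines:
            kept.append(stanza)
    return "\n\n".join(kept) + "\n"
```

Calling it with `plant_anatomy` gives the anatomy file, and calling it with `plant_structure_development_stage` gives the temporal file. The cross-branch replaced_by and consider lines from step 2 still have to be removed first.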

po-temporal-assert-basic.obo

  • This file is created by filtering plant-ontology-assert-basic.obo to contain only terms from the plant structure development stage branch. The instructions below describe how to do this in OBO-Edit, but in theory it could be done using Oort as well.

To create po-temporal-assert-basic.obo in OBO-Edit:

  • Open the temp version of plant-ontology-assert-basic.obo (from above), in which any replaced_by or consider relations that go between the two branches of the ontology have been removed.
  • Open advanced save dialog
  • Set the save path to .../po-temporal-assert-basic.obo
  • Check "Filter terms" only
  • Set up filter to "Have, namespace, contains, plant_structure_development_stage"
  • Check "always save properties"
  • Add a comment "Filtered from plant-ontology-assert-basic.obo to have only plant structure development stage terms. Matches plant_ontology.obo version ####."
  • Save and check by re-opening in OBO-Edit

Creating it from plant_ontology_assert.obo

    • Note: We decided not to generate these for Release#21, but can add later if needed.

1. Open plant_ontology_assert.obo in OBO-Edit

2. Open the advanced save dialog and set the save path to .../po_temporal.obo

3. Check "Filter terms": set up filter to "Have, namespace, contains, plant_structure_development_stage"

4. Check "always save properties"

5. Add a comment "Filtered from plant_ontology_assert.obo to have only plant structure development stage terms. Matches plant_ontology.obo version ####."

6. Save and check by re-opening in OBO-Edit. If there are dangling cross-branch relations, you will get an error

7. Manually remove any replaced_by or consider relations that go between the two branches of the ontology (these include: gametophyte, seedling, sporophyte).

8. Manually remove any intersection_of relations: Even though these are not pointing at the anatomy branch, they cause an error

  • flower development stage (PO:0007615)
  • plant embryo stage (PO:0007631) generated 1 warning
  • plant spore stage (PO:0025375) generated 1 warning


  • Note: If these files are created from plant_ontology_assert.obo, you will have to remove the relations that cross the two branches, in addition to the ones above.
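Before saving a single-branch file, the lines that point across the two branches can be listed with a short Python sketch. The names are hypothetical, and the sketch assumes blank-line-separated stanzas with explicit `namespace:` lines. It reports replaced_by, consider, relationship, and intersection_of lines whose target term lives in a different namespace from the term itself. It does not report intersection_of lines that stay within one branch, which step 8 also removes.

```python
import re

PO_ID = re.compile(r"PO:\d{7}")
REFERENCE_TAGS = ("replaced_by:", "consider:", "relationship:", "intersection_of:")


def tag_value(lines, tag):
    """Return the value of the first 'tag: value' line, or None."""
    prefix = tag + ": "
    for line in lines:
        if line.startswith(prefix):
            return line[len(prefix):].split("!")[0].strip()
    return None


def cross_branch_lines(obo_text):
    """List (term_id, line) pairs whose target is in another namespace."""
    terms = []
    for stanza in obo_text.strip().split("\n\n"):
        lines = stanza.splitlines()
        if lines and lines[0] == "[Term]" and tag_value(lines, "id"):
            terms.append((tag_value(lines, "id"), lines))
    namespace_of = {tid: tag_value(lines, "namespace") for tid, lines in terms}
    found = []
    for term_id, lines in terms:
        for line in lines:
            if not line.startswith(REFERENCE_TAGS):
                continue
            targets = PO_ID.findall(line.split("!")[0])
            if any(t in namespace_of and namespace_of[t] != namespace_of[term_id]
                   for t in targets):
                found.append((term_id, line))
    return found
```

The reported lines are candidates for manual removal; the sketch does not edit the file.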

Text (tab-delimited) version files:

These files contain the PO:ID, term name, the definition, any synonyms and the aspect (in po_ontology.txt)

Oct 2011: It would also be good to have a column listing the alt_ids.

Note: These files also include the obsoleted terms. This is indicated in the definition field of those terms and in some cases, in the term name. Any terms that have "obsolete" in their name are terms that have the same name as a live term.

The "obsolete" was added because older versions of OBO-Edit did not let you have two terms with the same primary name. I think it does now, but we still kept the practice, just so people wouldn't accidentally use the obsolete term that has the same name.

Note: These are generated directly from the MySQL database and not from the obo files. The Perl script used to generate the .txt files is located in the plant-ontology/scripts-etc/ GitHub page.
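To illustrate the layout of these files only, the Python sketch below builds the same five columns from OBO text. This is not how the release files are produced (they come from the MySQL database via the Perl script). The function names, the header row, and the use of "|" to join multiple synonyms are all assumptions made for the example.

```python
import csv
import io


def quoted_text(value):
    """Return the text inside the first pair of double quotes, if any."""
    parts = value.split('"')
    return parts[1] if len(parts) > 2 else value


def obo_to_rows(obo_text):
    """Build [id, name, definition, synonyms, aspect] rows from [Term] stanzas."""
    rows = []
    for stanza in obo_text.strip().split("\n\n"):
        lines = stanza.splitlines()
        if not lines or lines[0] != "[Term]":
            continue
        fields = {"id": "", "name": "", "def": "", "namespace": ""}
        synonyms = []
        for line in lines[1:]:
            tag, _, value = line.partition(": ")
            if tag == "synonym":
                synonyms.append(quoted_text(value))
            elif tag == "def":
                fields["def"] = quoted_text(value)
            elif tag in fields:
                fields[tag] = value
        rows.append([fields["id"], fields["name"], fields["def"],
                     "|".join(synonyms), fields["namespace"]])
    return rows


def write_tsv(rows):
    """Serialize the rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "name", "definition", "synonyms", "aspect"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Definitions that contain escaped quotes would need a real OBO parser; the simple split on double quotes is only good enough for a sketch.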

plant_ontology.txt (was po_all.tbl)

This should be called plant_ontology.tbl. Note: When this change is made, we will need to fix the link from the download page.

Also need to change it in the script

changed from po_all.tbl to plant_ontology.txt on 8-21-13

po_anatomy.txt and po_temporal.txt

po_synonyms.txt

What is this file?

This file lists all PO:ids with their name, Spanish/Japanese synonyms, and definition.

Location of tab delimited files

OWL versions

OWL Version of the editor's plant-ontology file:

  • In April 2013 (SVN # 1840), a new cron job was initiated that creates an OWL file from dev every night.
  • Creation is done by executing the command "obolib-obo2owl plant_ontology.obo -o plant_ontology.owl" (Is this correct?)
  • This file then needs to be hand edited to include the version info. The following line goes right above the </owl:Ontology> line:

<owl:versionInfo xml:lang="en">version ##</owl:versionInfo>

where ## is the version number.
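The hand edit could be replaced by a small Python helper along these lines. This is a sketch with a hypothetical function name. It assumes the file contains a closing `</owl:Ontology>` tag and uses the standard `owl:versionInfo` annotation (note the capital I) with the language attribute placed on the tag.

```python
def add_version_info(owl_text, version):
    """Insert an owl:versionInfo line directly above </owl:Ontology>."""
    closing = "</owl:Ontology>"
    if closing not in owl_text:
        raise ValueError("no </owl:Ontology> closing tag found")
    if "<owl:versionInfo" in owl_text:
        # Already stamped: leave the text unchanged.
        return owl_text
    line = f'<owl:versionInfo xml:lang="en">version {version}</owl:versionInfo>\n'
    return owl_text.replace(closing, line + closing, 1)
```

Running it twice is harmless, since the second call finds the existing annotation and returns the text as is.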

OWL Version of the Live plant-ontology file (release version):

  • As of Aug. 2015, the plant-ontology.owl file is located in the Planteome GitHub Repository/po-release-files

  • plant_ontology.owl (This corresponds to plant_ontology.obo from the live tag)

An OWL version of the ontology file is generated for each release, which is located in the plant-ontology/po-release-files repository at: GitHub.

This file is generated from the plant_ontology.obo file at the time of the release, and subsequently copied to GitHub.

Only the plant_ontology.owl file is moved over to GitHub.

  • We need a 'Readme' file for this folder and some notes on how this is generated

PO Anatomy Glossary

Plant Anatomy Glossary

The Plant Anatomy glossary is linked to the live database and updates automatically, but it is a good idea to check it after the database is switched.

Updating the "Other" Browsers

Recently, we have created a few other browsers that will need to be updated after each release:

note: This browser no longer exists

Quality Checks and Reviews:

Before each release, ontology editors and curators should run some QC checks. In fact, these should be done on a regular basis in between releases, but it is crucial to do them immediately before a release.

  • Run the reasoner to remove any redundant links.
  • Do a search for extra space and odd characters in terms, definitions and dbxrefs (see above).
  • Check for any lines in the existing association files that reference obsolete or alternate IDs (see above)
  • Update the translation files
  • Add any new DbxRefs and fix any broken links
  • Update any annotation files

Located_in: see embryo sac (PO:0025074)

Internal Reviews

  • Once we have completed the editing, the Ontology should be open for review on the Beta browser
  • The whole POC team should take a little time to do an internal review. Look for any inconsistencies, typos, broken links, etc.
  • This may occur with or without the full set of annotations, depending on what we want to test (see below for more details on Beta).

Inviting Specific External Reviewers

For some releases, we may ask specific experts and/or collaborators to take part in an External Review. This was done in Aug 2010 for the October 2010 release. The external experts will need at least a couple of weeks and up to one month.

To facilitate the review, the POC may need to make a presentation to them, such as the Plant_Ontology_Webinar-_May_2011_release PO-Physcomitrella presentation, to feature the new terms for non-vascular plants, in particular, Physcomitrella patens.


Add more details here of the external review process...

Beta Release Announcement

An announcement should be sent out on the mailing lists, something like this:

"A beta version of the latest release of the PO is now available on our AmiGO Browser (http://beta.plantontology.org/amigo/go.cgi). Along with the Ontology itself the beta browser also is loaded with some test annotation files (not the full set of files at this point).

You can download the ontology files in OBO-Edit format from GitHub

There is a readme.md file on the same page explaining the differences among the available files.

Further information on the upcoming release can be obtained from: Links to Release Pages: "Summary of Changes" and "New and Obsoleted Terms"

Please review the Plant Ontology and let us know asap about any important issues you observe."

Loading the Live Version

Once the review and internal checks and fixes are completed, the plant_ontology_assert.obo file and the whole database of annotations are loaded from the beta branch onto the beta browser.

Note that it is the "plant_ontology_assert.obo" file that should be loaded onto the AmiGO browser on the live site.

Once the release is live on the Plant Ontology page, the ontology files should be copied from the beta branch to the po-release-files folder on GitHub.

This is a stable URL that will not change, so it can be incorporated into users' scripts, etc.


In the October 2011 release, we introduced Japanese translations of the term names. Currently, JE strips out the Japanese synonyms before he loads the live version onto AmiGO. We had discussed whether to remove them from the live release and provide a separate file that includes the translated terms, but that was rejected (meeting date: fill in date...). We have had complaints from some of our user groups (e.g., Gramene) that their browsers cannot display the Japanese characters. We should reconsider this approach.
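One possible way to strip the Japanese synonyms is sketched below in Python. This is not JE's actual method, which is not documented here. The function name is hypothetical, and the sketch assumes that a Japanese synonym can be recognized by the presence of hiragana, katakana, or CJK ideographs on a `synonym:` line.

```python
import re

# Hiragana and katakana (U+3040 to U+30FF) plus CJK unified ideographs
# (U+4E00 to U+9FFF).
JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")


def strip_japanese_synonyms(obo_text):
    """Drop synonym lines that contain Japanese characters."""
    kept = [
        line
        for line in obo_text.splitlines()
        if not (line.startswith("synonym:") and JAPANESE.search(line))
    ]
    return "\n".join(kept) + "\n"
```

The same function could be used the other way around, writing the dropped lines to a separate file, if the team decides to offer the translations as a separate download.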

Changes to the HTML Files for the PO web site:

Prepare "Release Notes" page

  • Each Release and any Interim Data Release should have a unique Release Notes webpage on the PO site detailing the current statistics of the release.
  • The old pages are archived (see Release Notes Archive) so that we can keep track of the changes.


Update PO front page and other pages

  • Post release announcements
  • Update any "Upcoming Events" and move older events to the archive page


  • Update the "Filter Annotation Objects Counts" for new species, data types in release


This is a good time to incorporate any changes or fixes to the web pages:

List needed fixes here:

  • Documentation page: replace link to old dbxrefs file with a link to the current one on the svn.
  • Documentation page doesn't exist anymore.

Announcements

  • Prepare and post announcements for the PO front page, Jaiswal Lab Page, FB page
  • Send out announcements to the mailing lists

other places??


Updating the Plant Ontology files on other sites:

The two main sites where people go to download the PO are:

Also Ontobee, but that pulls the file from http://purl.obolibrary.org/obo/po.owl, so it should always be up to date.

Also featured at:

  • TAIR:

See this page for links to other sites that feature the PO: Links_to_sites_using_the_PO

Updating the "Filter Annotation Objects" lists on the AmiGO browser:

  • If annotations from new database groups or species are added, we should update the "Filter Annotation Objects" list on the browser

These lists should match what is in our database; as listed on the release notes page.

LC: new in V#16: Jaiswal Lab: Fragaria vesca (strawberry) and Gramene: Physcomitrella

In Jan 2012 16A release, we added a large set from Physcomitrella from Cosmoss. Mar 2012 release will include new ones from AgBase for cotton (Gossypium spp.) and new species for SGN.

We also need SGN germplasm which has 4503 annotations

  • It would be good to be able to filter by species

JE: This is handled by the freez_hash_misc_keys script. Make the changes in that file, run the script and copy the created file to the cgi-bin/plantontology directory.