UK Spelling Synonym generation pipeline

Nicole Vasilevsky edited this page Jan 28, 2020 · 2 revisions

The HPO has its own UK spelling synonym generation pipeline that can be found in the extended Makefile. It can be invoked as follows:

cd src/ontology
sh run.sh make add_british_language_synonyms

The process works as follows:

  1. Query hp-edit.owl for all HP terms and their labels (using a dedicated SPARQL query run through ROBOT's query command)
  2. Query hp-edit.owl for all HP terms and their synonyms
  3. Generate British synonyms using a custom Python script that loads the label and synonym lists from the two steps above, plus a dictionary (which is maintained by the HPO editor, so it can be extended if need be). The script first generates UK spellings for all labels and all synonyms and then removes those for which an (exact) synonym already exists. It exports the resulting UK spellings as a TSV file that is a fully functional ROBOT template.
  4. Compile the ROBOT template with the new synonyms into OWL (british_synonyms.owl).
  5. Merge the british_synonyms.owl file into the edit file.
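
The core of step 3 (generate UK spellings, then drop those that already exist as a label or exact synonym) can be illustrated with a minimal Python sketch. This is not the actual HPO script: the function names `to_british` and `generate_british_synonyms`, the in-memory data layout, the two-entry dictionary, and the `EX:` term IDs are all assumptions made for illustration. The real script reads the query results and the editor-maintained dictionary from files and writes a ROBOT template.

```python
import re

# Hypothetical US -> UK dictionary; the real one is maintained by the HPO editor.
US_TO_UK = {"tumor": "tumour", "anemia": "anaemia"}


def to_british(text, dictionary):
    """Replace each whole-word US spelling in `text` with its UK spelling."""
    def swap(match):
        word = match.group(0)
        uk = dictionary.get(word.lower())
        if uk is None:
            return word
        # Preserve a leading capital, e.g. "Anemia" -> "Anaemia".
        return uk[0].upper() + uk[1:] if word[0].isupper() else uk
    return re.sub(r"[A-Za-z]+", swap, text)


def generate_british_synonyms(labels, exact_synonyms, dictionary):
    """Return sorted (term_id, uk_spelling) pairs that are not yet present.

    labels: dict mapping term ID -> label
    exact_synonyms: dict mapping term ID -> set of exact synonym strings
    """
    candidates = set()
    for term_id, label in labels.items():
        existing = {label} | exact_synonyms.get(term_id, set())
        for text in existing:
            uk = to_british(text, dictionary)
            # Keep only real changes that are not already on the term.
            if uk != text and uk not in existing:
                candidates.add((term_id, uk))
    return sorted(candidates)


# Made-up example data (not real HPO terms).
labels = {"EX:1": "Anemia", "EX:2": "Neoplasm"}
exact = {"EX:2": {"Tumor", "Tumour"}}
print(generate_british_synonyms(labels, exact, US_TO_UK))
# [('EX:1', 'Anaemia')]
```

In this example, "Tumour" is not suggested for `EX:2` because it already exists there as an exact synonym, which mirrors the filtering described in step 3.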

After the pipeline is run, the responsible editor should review the list of suggested synonyms (using the ROBOT template CSV file, which is easier to read) and then create a Pull Request to make sure no errors were introduced. A frequent error is that a generated synonym is already present on the term, just not as an exact synonym (perhaps as a relatedSynonym). For this reason, the changes should not be committed to master straight away.
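
The frequent error mentioned above can be screened for before opening the Pull Request. The following Python sketch assumes the editor has the suggested synonyms and the existing non-exact synonyms available in memory. The function name `flag_scope_conflicts`, the data layout, and the `EX:` term IDs are hypothetical and not part of the HPO pipeline.

```python
def flag_scope_conflicts(candidates, other_synonyms):
    """Return candidates that already exist on the term with a non-exact scope.

    candidates: iterable of (term_id, uk_spelling) pairs
    other_synonyms: dict mapping term ID -> dict of synonym text -> scope name
    """
    conflicts = []
    for term_id, uk in candidates:
        # Compare case-insensitively so "Oedema" matches "oedema".
        scopes = {
            text.lower(): scope
            for text, scope in other_synonyms.get(term_id, {}).items()
        }
        scope = scopes.get(uk.lower())
        if scope is not None:
            conflicts.append((term_id, uk, scope))
    return conflicts


# Made-up example data (not real HPO terms).
candidates = [("EX:1", "Anaemia"), ("EX:2", "Oedema")]
other = {"EX:2": {"oedema": "hasRelatedSynonym"}}
print(flag_scope_conflicts(candidates, other))
# [('EX:2', 'Oedema', 'hasRelatedSynonym')]
```

Any candidate reported this way deserves a manual decision: either drop the suggestion or change the scope of the existing synonym.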