
ECTO Content, Structure, and Quality Control

Lauren edited this page Aug 19, 2022 · 2 revisions

ECTO Content

ECTO contains a variety of precomposed exposure classes, which were included in the ontology because of their documented effects in the literature, their need in various use cases, and their potential for reuse in future applications. Although many potential exposure terms could be developed (e.g., exposures for all chemicals in ChEBI), precomposed terms should focus on those that are meaningful for reuse and for direct application in use cases. For more specific exposures, we plan to leverage our developing annotation model to support use cases without inflating the ontology.

Term requests are typically documented and tracked using the GitHub Issues tracker, where ECTO documentation is also managed (https://github.com/EnvironmentOntology/environmental-exposure-ontology/issues). Anyone with internet access and a free GitHub account can submit an ECTO term request. The Issues tracker can also be used to communicate with other researchers, ontology curators, and community users to ensure that ECTO terms are accurate and appropriately developed for maximal usage. In addition to handling community requests, ECTO curators use the Issues board to plan and document current and future work on ECTO. For example, environmental exposure questionnaires and surveys are commonly used to estimate human exposures and relate them to health outcomes. Prominent environmental exposure surveys, including those from the Undiagnosed Disease Network [22] and the Environmental Polymorphisms Registry [23], have been highlighted as meaningful sources of common human exposures that could be modeled in ECTO. Open resources are also used to support accurate exposure class development. For example, resources from the Occupational Safety and Health Administration (OSHA) are used to inform occupational exposure classes. Ongoing projects and evaluations of surveys and other resources can be found in the Issues tracker, and continued work towards curating these common exposures will be documented there. Beyond the Issues tracker, the Monarch Initiative hosts a ‘Computable Exposures’ electronic mailing list that provides periodic updates regarding ECTO to stakeholders and community contributors.

ECTO ID Structure and Maintenance

Currently, ECTO contains 2763 terms generated using Dead Simple OWL Design Patterns (DOSDP), which remain the primary means of curation. Within ECTO, a Term ID consists of the prefix ‘ECTO:’ followed by a 7-digit numeric value. Every original term/class in ECTO has a unique Term ID that is independent of any other identifier that may be associated with that class outside of ECTO. All terms in ECTO are dereferenceable and can be looked up using each term’s distinct PURL (e.g. http://purl.obolibrary.org/obo/ECTO_9000194). Although multiple curators have developed terms, each DOSDP is assigned its own Term ID range (e.g., ECTO:900XXXX refers to the pattern ‘exposure to chemical’), so that terms can continue to be added to a pattern within its ID range without disrupting the numbering scheme.

For metadata, all ECTO terms are required to have an ‘entity label’ and to be classified in the ECTO hierarchy. It is also highly recommended to include a natural language definition, synonyms, a logical definition, an axiomatic definition, and attributions to the creator(s).

As new terms are added to ECTO, existing terms are likely to change and may be considered for deprecation. In this process, an existing term may either be made obsolete without its information being carried over, or be made obsolete and merged into another term. The process begins with a report on the ECTO GitHub Issues tracker in which the term(s) in question are flagged as candidates for obsoletion. Examples of when this might happen are: 1) duplicate terms exist within ECTO, or 2) the term does not meet the requirements for inclusion in ECTO (it is not an environmental condition, treatment, or exposure). Once the term has been deemed appropriate for deprecation, it can be made obsolete by adding it to the appropriate file (obsolete.tsv or replaced.tsv) within the ECTO GitHub repository. The obsoletion comes into effect with the next release of the ontology.
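As an illustration of the Term ID convention described in this section, the following Python sketch converts between the ‘ECTO:’ form of an ID and its PURL, and checks whether an ID falls within a pattern's number range. It is a minimal sketch, not part of the ECTO tooling: the function names are hypothetical, the digit width is assumed from the example PURL above (seven digits in ECTO_9000194), and the ‘900’ prefix is taken from the ‘exposure to chemical’ example.

```python
import re

# Assumption: the numeric part has seven digits, as in the example
# PURL http://purl.obolibrary.org/obo/ECTO_9000194.
CURIE_PATTERN = r"ECTO:(\d{7})"
PURL_BASE = "http://purl.obolibrary.org/obo/"
PURL_PATTERN = re.escape(PURL_BASE) + r"ECTO_(\d{7})"


def curie_to_purl(curie: str) -> str:
    """Turn an ID such as 'ECTO:9000194' into its PURL (hypothetical helper)."""
    match = re.fullmatch(CURIE_PATTERN, curie)
    if match is None:
        raise ValueError(f"not a well-formed ECTO Term ID: {curie!r}")
    return f"{PURL_BASE}ECTO_{match.group(1)}"


def purl_to_curie(purl: str) -> str:
    """Turn an ECTO PURL back into the 'ECTO:' form (hypothetical helper)."""
    match = re.fullmatch(PURL_PATTERN, purl)
    if match is None:
        raise ValueError(f"not an ECTO PURL: {purl!r}")
    return f"ECTO:{match.group(1)}"


def in_pattern_range(curie: str, prefix: str = "900") -> bool:
    """Check whether an ID's numeric part starts with a pattern's prefix.

    The default '900' follows the 'exposure to chemical' example; the
    prefixes of other patterns are not listed here and would need to be
    looked up in the ECTO repository.
    """
    match = re.fullmatch(CURIE_PATTERN, curie)
    return match is not None and match.group(1).startswith(prefix)


if __name__ == "__main__":
    print(curie_to_purl("ECTO:9000194"))
    print(in_pattern_range("ECTO:9000194"))
```

Rejecting malformed IDs with an error, rather than passing them through, keeps a mistyped ID from silently producing a PURL that does not resolve.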

Quality Control

Given the success of ECTO’s application in projects such as the EPR project, and the previously established need for standardized toxicology and exposure terminologies, ECTO is well positioned to provide content in this area. When content is curated, both the evidence and the annotations are fully reviewed by an expert biocurator prior to inclusion. Additionally, annotations can be supported by GO Evidence Codes from the Evidence and Conclusions Ontology, alongside a reference. Annotations follow the GO Annotation File (GAF) format [21], and the ECTO annotations file is housed within the ECTO GitHub repository. To assure the quality of ECTO content, a variety of logical consistency checks are embedded in the ontology, because it was developed using the Ontology Development Kit (ODK). All updates to the ontology are made through GitHub, which maintains version control and houses the Travis quality control assessment tools. Our curators also closely follow ECTO’s status on the OBO Foundry Dashboard reports to ensure that the ontology is functional and satisfies OBO Foundry principles (http://dashboard.obofoundry.org/dashboard/ecto/dashboard.html).
