hchiba1 edited this page Oct 13, 2017 · 7 revisions

Harnessing the Orthology Ontology for inferring new information

Hirokazu Chiba, Tarcisio Mendes de Farias

Description: Genes shared among different species are evidence of evolution from a common ancestor. For example, we share approximately 90% of our genes with mice. These related genes are called orthologs. Currently, more than 50 orthology databases are available. End-users of these databases are typically interested in pairwise relationships (e.g. is orthologous to). However, today’s orthology information providers store all pairwise relationships, and the number of such relationships grows quadratically with the number of genes or genomes. To address this problem, we propose a two-step approach. First, we extend and adapt the ORTH ontology to be compliant with Description Logics, for the sake of decidability and the availability of reasoning tools. Second, we derive the pairwise relationships with an inference engine from the information that is implicitly structured in Hierarchical Orthologous Groups. With this approach, the data to be stored and retrieved scales linearly. For example, we do not need to store pairwise orthologs between species, because they can be inferred by applying a Horn-like rule (or by using a query rewriting approach). We thereby avoid the materialization of billions of triples.
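The idea of deriving pairwise relations from a hierarchical group, instead of storing them, can be illustrated with a minimal Python sketch. It is not the project's rule set or its SPARQL queries. The `Node` class, the `kind` labels and the gene identifiers are hypothetical. The sketch relies only on the standard definitions: two genes are orthologs when their last common ancestor in the group is a speciation node, and paralogs when it is a duplication node.

```python
from dataclasses import dataclass, field
from itertools import combinations, product


@dataclass
class Node:
    """A node of a hierarchical orthologous group (hypothetical minimal model)."""
    kind: str                      # "speciation", "duplication" or "gene"
    name: str = ""                 # gene identifier, used for leaves only
    children: list = field(default_factory=list)


def genes(node):
    """Return all gene identifiers found below a node."""
    if node.kind == "gene":
        return [node.name]
    return [g for child in node.children for g in genes(child)]


def pairs(node, kind):
    """Yield gene pairs whose last common ancestor is a node of the given kind."""
    if node.kind == "gene":
        return
    if node.kind == kind:
        # Genes taken from two different children meet for the first time here.
        for left, right in combinations(node.children, 2):
            for a, b in product(genes(left), genes(right)):
                yield tuple(sorted((a, b)))
    for child in node.children:
        yield from pairs(child, kind)


# Toy group (made-up identifiers): a duplication in the human lineage
# that happened after the human/mouse speciation.
hog = Node("speciation", children=[
    Node("duplication", children=[
        Node("gene", "HUMAN_A1"),
        Node("gene", "HUMAN_A2"),
    ]),
    Node("gene", "MOUSE_A"),
])

orthologs = sorted(pairs(hog, "speciation"))
paralogs = sorted(pairs(hog, "duplication"))
```

Only the tree is stored (three genes and two internal nodes in this toy case), while `orthologs` and `paralogs` are computed on demand. This is the same trade-off that a Horn-like rule or a rewritten query makes over the ontology data.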

Goals

  • Description Logic compliant orthology ontology to take advantage of automatic reasoning tools
  • Define a set of logical rules or stored queries for a query rewriting approach to avoid materialization

Midterm wrap-up

  1. Updated the Orthology Ontology: reconsidered the properties, checked whether the domains and ranges are defined properly, and made 22 main modifications
  2. Defined a draft version of a Life Science Cross Reference Ontology to enhance interoperability among life science databases
  3. Started defining a query set and a corresponding logical rule set to infer Orthology Ontology property assertions such as hasOrtholog, hasParalog, hasXenolog, hasInParalog, hasOutParalog and hasHomolog.

Final wrap-up

Made 27 modifications

  • Added pairwise relations
  • Checked if the domains and ranges are properly defined
  • Fixed other issues on the classes

Converting from original orthology datasets

  • OMA, and discussion about InParanoid

Inference from hierarchical data

  • SPARQL sub-queries for inferring ortholog, paralog, in-paralog and out-paralog relations
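The distinction between in-paralogs and out-paralogs depends on a reference speciation event: in-paralogs come from a duplication after that speciation, out-paralogs from a duplication before it. The following Python sketch shows this logic as a plain tree walk. It stands in for the SPARQL sub-queries and does not reproduce them. The nested-tuple tree, the taxon labels and the gene identifiers are hypothetical. Duplications in lineages unrelated to the reference speciation are left unclassified, which is an assumption of this sketch.

```python
from itertools import combinations, product

# Hypothetical toy gene tree as nested tuples: (kind, label, children).
# kind is "speciation", "duplication" or "gene"; label is a taxon name for
# internal nodes and a gene identifier for leaves.
TREE = ("duplication", "Vertebrata", [
    ("speciation", "Euarchontoglires", [
        ("duplication", "Homo sapiens", [
            ("gene", "HUMAN_A1", []),
            ("gene", "HUMAN_A2", []),
        ]),
        ("gene", "MOUSE_A", []),
    ]),
    ("speciation", "Euarchontoglires", [
        ("gene", "HUMAN_B", []),
        ("gene", "MOUSE_B", []),
    ]),
])


def leaves(node):
    """Return all gene identifiers found below a node."""
    kind, label, children = node
    if kind == "gene":
        return [label]
    return [g for child in children for g in leaves(child)]


def contains_speciation(node, taxon):
    """True if the node is, or has below it, a speciation node labelled with the taxon."""
    kind, label, children = node
    if kind == "speciation" and label == taxon:
        return True
    return any(contains_speciation(child, taxon) for child in children)


def classify_paralogs(node, taxon, below=False, result=None):
    """Map each paralogous pair to "in" or "out" relative to the speciation at taxon."""
    if result is None:
        result = {}
    kind, label, children = node
    if kind == "duplication":
        if below:
            tag = "in"       # duplication happened after the reference speciation
        elif contains_speciation(node, taxon):
            tag = "out"      # duplication happened before the reference speciation
        else:
            tag = None       # unrelated lineage: left unclassified in this sketch
        if tag:
            for left, right in combinations(children, 2):
                for a, b in product(leaves(left), leaves(right)):
                    result[tuple(sorted((a, b)))] = tag
    below = below or (kind == "speciation" and label == taxon)
    for child in children:
        classify_paralogs(child, taxon, below, result)
    return result


relations = classify_paralogs(TREE, "Euarchontoglires")
```

With the human/mouse speciation as reference, `relations` marks the pair `HUMAN_A1`/`HUMAN_A2` as in-paralogs and every pair that spans the root duplication, such as `HUMAN_A1`/`HUMAN_B`, as out-paralogs.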

Interoperable cross references

  • Necessary for integrative search