Thursday, March 11, 2010

Review of Towards pharmacogenomics knowledge discovery with the semantic web

The article "Towards pharmacogenomics knowledge discovery with the semantic web", written by Michel Dumontier and Natalia Villanueva-Rosales, attempts to demonstrate the importance of pharmacogenomics and how its data should best be structured. Their strategy for knowledge discovery involves ontology design, population and question answering. More specifically, they carry this out with the Web Ontology Language (OWL-DL), Protégé and the Manchester OWL Syntax.

With the SO-Pharm project as inspiration, the authors attempt to improve the area of pharmacogenomics by reducing the number of classes and instances and focusing on better relations, which gives a less complex knowledge base. In this way they intend to extend the existing knowledge base PharmGKB with OWL-DL. The area of application is personalized medicine, with knowledge about depression as their main focus. Protégé is used as the management system, with the Manchester OWL Syntax as query language, to make the system easier to use for doctors, researchers and patients.

The methods used to design the knowledge base (PO) are described in great detail, but unfortunately the description is quite hard to come to grips with as a rookie in the field. The aim of the design was to create a foundation that researchers and doctors would find easy to use.

The authors discuss the use of XML in the semantic web. Sadly, they do so in a less than awesome way: they claim that XML lacks semantics, and then later in the article they rely on the semantics built into XML to automatically convert data stored in XML. Although the authors probably have a point in that the semantics of RDF and OWL are more explicit and powerful, it is important to remember that XML is a syntactic language while OWL is a data model. The OWL data model can be stored in XML. Thus, just stating that XML lacks semantics is a bit like saying that a book does not have good chaptering: it is not enforced, but a book certainly can have it. There are many data formats based on XML, and the point here is that they can be very different, so it is difficult to make general statements about XML in this way. The use of XML2OWL to convert from the XML Schema (XMLS) of PharmGKB to OWL strongly suggests that XML does have semantics.
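To make the point about implicit semantics concrete, here is a minimal sketch in Python of how a mechanical converter can read meaning out of XML structure. It is not how XML2OWL works, and the sample record, its element names and the xml_to_triples helper are all invented for illustration; none of them are taken from PharmGKB.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are invented for this
# sketch and are NOT taken from the PharmGKB schema.
SAMPLE = """
<drug id="D1">
  <name>fluoxetine</name>
  <treats>depression</treats>
</drug>
"""


def xml_to_triples(xml_text):
    """Read each child element of the root as a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    # The root element's tag and id attribute together name the subject.
    subject = f"{root.tag}:{root.get('id')}"
    triples = [(subject, "type", root.tag)]
    for child in root:
        # The nesting says "this child describes the parent"; the tag name
        # is reused as the predicate, with no formal definition behind it.
        triples.append((subject, child.tag, (child.text or "").strip()))
    return triples


triples = xml_to_triples(SAMPLE)
for triple in triples:
    print(triple)
```

The conversion only works because the nesting and the tag names already carry meaning that the converter can rely on. What the XML does not say is what "treats" formally means, and that is the part an OWL definition would add.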

The way they use the words class and concept creates further confusion, since class and concept mean the same thing in an ontology. Sometimes a concept is referred to as a concept and other times as a class: “... representation containing 70 core classes and over 40000 concepts..” and, further into the text, “... Pharmacogenomics Ontology identifies 40 core concepts...”.

In the area of relations there must be further development. Even though many relationships are correctly described with the Basic Relation Ontology (BRO), some relationships are only described with the generic bro:isRelatedTo, because of the low level of semantics. To create an accurate base for knowledge retrieval where all queries are possible, these relationships must be dealt with.
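A toy illustration in Python of why a generic relation limits question answering. Only bro:isRelatedTo comes from the article; the ex: identifiers, the ex:isTreatedWith relation and the objects helper are hypothetical and are not part of BRO or the Pharmacogenomics Ontology.

```python
# Toy triple store. Only bro:isRelatedTo is named in the article; every
# ex: identifier, including ex:isTreatedWith, is hypothetical.
TRIPLES = [
    ("ex:depression", "ex:isTreatedWith", "ex:drugA"),
    ("ex:depression", "bro:isRelatedTo", "ex:drugB"),
    ("ex:depression", "bro:isRelatedTo", "ex:geneX"),
]


def objects(subject, predicate):
    """Return every object linked to the subject by exactly this predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


# A specific relation answers a specific question.
treatments = objects("ex:depression", "ex:isTreatedWith")

# The generic relation mixes drugs and genes and says nothing about how
# they are related, so "what treats depression?" cannot be answered from it.
related = objects("ex:depression", "bro:isRelatedTo")

print(treatments)
print(related)
```

In this sketch ex:drugB may well be a treatment too, but nothing in the data says so, and no query over the generic relation can recover that.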
The ambition of the research is important for a brighter and more knowledge-filled future of personalized medicine. The work may have a bit further to go, but at least it is aimed in the right direction.

Annsofie Andersson and Jonathan Alvarsson

Dumontier, M., & Villanueva-Rosales, N. (2009). Towards pharmacogenomics knowledge discovery with the semantic web. Briefings in Bioinformatics, 10(2), 153-163. DOI: 10.1093/bib/bbn056

1 comment:

  1. Hi!
    It's great that you took the time and effort to review our paper :-) I would like to address some of the points that you've argued.
    First, XML is a language to represent information, just as RDF/OWL is, and has a data model (as per its defined syntax and semantics). The XML data model is a (potentially ordered) set of nodes, untyped relations and literals. Like RDF/OWL, elements are typically defined in some namespace, thereby allowing us to extend the model with elements that have certain meaning. While XMLS allows us to constrain the document to certain elements and attributes, we are unable to provide more formal meaning to the relationships that exist between them. This is where RDF/OWL come into play. Now, with their enriched, formally defined vocabulary, we can design documents where the meaning of the elements can be more formally expressed. While RDF can be expressed using the XML syntax, the meaning of those elements is formally described using RDF's vocabulary, or in our case, using OWL.
    As for class/concept, there is an important distinction. A class is in the context of OWL semantics, whereas a concept is in the context of a conceptualization of the domain. It is the case that our concepts are formally expressed as OWL classes.