UU Cheminformatics Journal Club: March 2010

The UUCJC is picking up slowly while we settle on the exact format. We already had to postpone the discussion of the first meeting because I had a nasty cold, and we did not manage to finish reviewing the SMSD paper yesterday.

The aims of the journal club are to:

learn to critically read scientific papers
learn about important cheminformatics algorithms
formulate a review report

We keep a schedule that alternates between four topics:

data and knowledge representation,
cheminformatics algorithms,
data analysis,
finding your way around modern publishing.

Meeting format
Everyone is free to join in on one or more meetings, but if you do join in on a meeting, it is expected that you take part in the discussions. If you are not interested in some parts of the discussion, you are free to express this; the order in which we discuss things in the paper is flexible. The meeting itself has a fixed format:

a round in which everyone, in two minutes:
1. summarizes the paper,
2. indicates what they found interesting and what not (this affects the order in which things are discussed during the meeting),
3. gives an overall (preliminary) verdict on the paper;
a walkthrough of the paper, paragraph by paragraph (guided by people's interests);
formulation of the UUCJC opinion of the paper.

During the meeting, notes will be made, and after the meeting one or two participants will write up the formulated opinion of the paper for inclusion in this blog.

Social Web
The review for the first paper we discussed is now online, and I added a box at the end as expected by ResearchBlogging.org (see also this blog entry). Because this box also contains the DOI, it will be picked up by Chemical blogspace, Nature.com Blogs, and possibly other social websites. It will also be used by PLoS ONE to calculate article-level metrics that reflect a paper's impact. Here is such a PLoS ONE metrics summary for a paper we will discuss later.

Updated Schedule

The next meeting dates are March 25 (we'll finish the SMSD paper), April 8 and 22, and May 6 and 20. The papers we have lined up:

Virtual screening of bioassay data, doi:10.1186/1758-2946-1-21
ChemAxiom – An Ontological Framework for Chemistry in Science, doi:10.1038/npre.2009.3714.1
How Large Is the Metabolome? A Critical Analysis of Data Exchange Practices in Chemistry, doi:10.1371/journal.pone.0005440
pKa Prediction of Monoprotic Small Molecules the SMARTS Way, doi:10.1021/ci8001815

The article Towards pharmacogenomics knowledge discovery with the semantic web, by Michel Dumontier and Natalia Villanueva-Rosales, attempts to demonstrate the importance of pharmacogenomics and how its data is best structured. Their strategy for knowledge discovery involves ontology design, ontology population, and question answering. Concretely, this was implemented with the Web Ontology Language OWL-DL, Protégé, and the Manchester OWL Syntax.

With the SO-Pharm project as inspiration, the authors attempt to improve the field of pharmacogenomics by reducing the number of classes and instances and focusing on better relations, which results in a less complex knowledge base. In this way the authors intend to extend the existing knowledge base PharmGKB with OWL-DL. The area of application is personalized medicine, with knowledge about depression as the main focus. The management system Protégé is used, with the Manchester OWL Syntax as query language, to make the system easier to use for doctors, researchers and patients.

The methods used for the design of the knowledge base (PO) are described in great detail, but unfortunately they are quite hard to come to grips with for a newcomer to the field. The aim of the design was to create a foundation that makes the system easy for researchers and doctors to use.

The authors discuss the use of XML in the semantic web. Sadly, they do so in a less than convincing way: they claim that XML lacks semantics, and then later in the article they rely on the built-in semantics of XML to automatically convert data stored in XML. Although the authors probably have a point in that the semantics of RDF and OWL are more explicit and powerful, it is important to remember that XML is a syntactic language while OWL is a data model. The OWL data model can be stored in XML. Thus, just stating that XML lacks semantics is a bit like saying that a book does not have a good chapter structure: it is not enforced, but a book certainly can have one. There are many data formats based on XML, and the point here is that they can be very different, so it is difficult to make general statements about XML in this way. The use of XML2OWL for converting from the XMLS of PharmGKB to OWL strongly suggests that XML does have semantics.
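
To make the XML point concrete, here is a minimal Python sketch. It is not the XML2OWL tool and does not use the PharmGKB schema; the record, the `xml_to_triples` helper and the `http://example.org/` namespace are all made up for illustration. It shows that a generic converter can only turn XML into subject-predicate-object statements by relying on the meaning implied by element names and nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical record for illustration only (not the PharmGKB schema).
PLAIN_XML = """
<drug id="D1">
  <name>ExampleDrug</name>
  <treats>depression</treats>
</drug>
"""

def xml_to_triples(xml_text, base="http://example.org/"):
    """Naive conversion: the root element becomes the subject,
    each child element becomes a (predicate, value) pair."""
    root = ET.fromstring(xml_text)
    subject = base + root.attrib["id"]
    # The element name is read as the type of the thing described.
    triples = [(subject, "rdf:type", base + root.tag.capitalize())]
    for child in root:
        # Nesting is read as "the subject has this property".
        triples.append((subject, base + child.tag, child.text.strip()))
    return triples

if __name__ == "__main__":
    for triple in xml_to_triples(PLAIN_XML):
        print(triple)
```

Nothing in the converter is specific to drugs: the statements come entirely from the structure of the XML, which is the sense in which that structure already carries meaning.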

The way the authors use the words class and concept causes confusion, since class and concept mean the same thing in an ontology. The same notion is sometimes referred to as concept and other times as class: “... representation containing 70 core classes and over 40000 concepts..” and, further into the text, “... Pharmacogenomics Ontology identifies 40 core concepts...”.

The relations need further development. Even though many relationships are correctly described with the Basic Relation Ontology (BRO), some relationships are only described as bro:isRelatedTo, because of the low level of semantics. To create an accurate base for knowledge retrieval where all queries are possible, this must be addressed.
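
Why a generic relation limits querying can be sketched in a few lines of Python. Only the name bro:isRelatedTo comes from the paper; the `ex:` terms, the triples and both helper functions are hypothetical and chosen purely for illustration.

```python
# Hypothetical triples; only bro:isRelatedTo is a name taken from the paper.
TRIPLES = [
    ("ex:DrugA", "ex:treats", "ex:Depression"),
    ("ex:DrugB", "bro:isRelatedTo", "ex:Depression"),
    ("ex:GeneX", "bro:isRelatedTo", "ex:DrugA"),
]

def subjects_with(predicate, obj, triples=TRIPLES):
    """Return subjects linked to obj by exactly this predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]

def generic_share(triples=TRIPLES, generic="bro:isRelatedTo"):
    """Fraction of triples that use only the generic relation."""
    return sum(p == generic for _, p, _ in triples) / len(triples)

if __name__ == "__main__":
    print(subjects_with("ex:treats", "ex:Depression"))
    print(subjects_with("bro:isRelatedTo", "ex:Depression"))
    print(generic_share())
```

A query for what treats depression finds only ex:DrugA here. Whether ex:DrugB treats, causes or merely co-occurs with depression cannot be recovered from the generic link, so such questions stay unanswerable until the relation is made specific.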
The ambition of the research is important for a brighter and more knowledge-filled future of personalized medicine. It may still have some way to go, but at least the aim is in the right direction.

Annsofie Andersson and Jonathan Alvarsson

Dumontier, M., & Villanueva-Rosales, N. (2009). Towards pharmacogenomics knowledge discovery with the semantic web. Briefings in Bioinformatics, 10(2), 153-163. DOI: 10.1093/bib/bbn056


Friday, March 12, 2010

Three new papers lined up

Thursday, March 11, 2010

Review of Towards pharmacogenomics knowledge discovery with the semantic web