Scoring and summarising gene product clusters using the Gene Ontology

Authors: Denaxas, Spiridon C.; Tjortjis, Christos

Publisher: Inderscience Publishers

ISSN: 1748-5673

Source: International Journal of Data Mining and Bioinformatics, Vol. 2, Iss. 3, 2008-09, pp. 216-235

Abstract

We propose an approach for quantifying the biological relatedness between gene products, based on their properties, and measure their similarities using exclusively statistical NLP techniques and Gene Ontology (GO) annotations. We also present a novel similarity figure of merit, based on the vector space model, which assesses gene expression analysis results and scores gene product clusters' biological coherency, making sole use of their annotation terms and textual descriptions. We define query profiles which rapidly detect a gene product cluster's dominant biological properties. Experimental results validate our approach, and illustrate a strong correlation between our coherency score and gene expression patterns.
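
The abstract does not give the formula behind the coherency score, so the following is only a generic illustration of how a vector space model can score a cluster from annotation text, not the authors' method. It assumes smoothed TF-IDF weighting and takes the mean pairwise cosine similarity as the cluster score; both choices, along with the function names and the toy gene products and tokens, are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Map each gene product to a sparse TF-IDF vector.

    docs: dict of gene product name -> list of annotation tokens.
    The smoothed IDF below is an assumption, not taken from the paper.
    """
    n = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        term_freq = Counter(tokens)
        vectors[name] = {
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coherence(docs):
    """Mean pairwise cosine similarity over all gene products in a cluster."""
    vectors = tfidf_vectors(docs)
    pairs = list(combinations(vectors.values(), 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Made-up cluster: tokens stand in for GO term names and descriptions.
cluster = {
    "geneA": ["protein", "kinase", "activity", "phosphorylation"],
    "geneB": ["protein", "kinase", "activity", "signal", "transduction"],
    "geneC": ["ion", "transport", "membrane"],
}
print(coherence(cluster))
```

Under these assumptions, a cluster whose members share annotation vocabulary scores close to 1, and a cluster with no shared terms scores 0.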