Mining microarray gene expression data with unsupervised possibilistic clustering and proximity graphs

Author: Romdhane L.   Shili H.   Ayeb B.  

Publisher: Springer Publishing Company

ISSN: 0924-669X

Source: Applied Intelligence, Vol.33, Iss.2, 2010-10, pp. : 220-231

Disclaimer: Any content in publications that violate the sovereignty, the constitution or regulations of the PRC is not accepted or approved by CNPIEC.

Previous Menu Next

Abstract

Gene expression data generated by DNA microarray experiments provide a vast resource of medical diagnostic and disease understanding. Unfortunately, the large amount of data makes it hard, sometimes even impossible, to understand the correct behavior of genes. In this work, we develop a possibilistic approach for mining gene microarray data. Our model consists of two steps. In the first step, we use possibilistic clustering to partition the data into groups (or clusters). The optimal number of clusters is evaluated automatically from the data using the Information Entropy as a validity measure. In the second step, we select from each computed cluster the most representative genes and model them as a graph called a proximity graph. This set of graphs (or hyper-graph) will be used to predict the function of new and previously unknown genes. Experimental results using real-world data sets reveal a good performance and a high prediction accuracy of our model.

Related content