TOWARDS CLASSIFYING ORGANISMS BASED ON THEIR PROTEIN PHYSICOCHEMICAL PROPERTIES USING COMPARATIVE INTELLIGENT TECHNIQUES

Authors: Banerjee, Amit Kumar; Harikrishna, Nayanoori; Kumar, Jangam Vikram; Murty, Upadhyayula Suryanarayana

Publisher: Taylor & Francis Ltd

ISSN: 1087-6545

Source: Applied Artificial Intelligence, Vol. 25, Iss. 5, 2011-05, pp. 426-439

Abstract

Several supervised and unsupervised methods are currently available for classifying and clustering highly nonlinear data sets. Biological data sets are known to be complex because of their high dimensionality, complex attribute interactions, and dynamic behavior. In this article, we present the classification of 16 organisms based on the physicochemical properties of their proteins, employing comparative intelligent techniques. Given the complexity of the working data set, an attempt was made to select the most important attributes using the feature selection facility available in TANAGRA (http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html) to improve classification efficiency. Various methods available in LIB-SVM (a library for support vector machines), the Waikato Environment for Knowledge Analysis (WEKA), and the Konstanz Information Miner (KNIME) were utilized. Support vector machines (SVMs), radial basis function (RBF), polynomial, multilayer perceptron (MLP), hyper-tangent, and sequential minimal optimization (SMO) were adopted to achieve maximum accuracy in the results. The best results obtained (>70%) are compared.
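
The abstract names three kernel families (RBF, polynomial, and hyper-tangent) without giving their formulas. As a rough illustration only, the sketch below writes out the commonly used textbook forms in plain Python. The function names, the parameter names (gamma, coef0, degree, kappa, delta), their default values, and the toy feature vectors are all assumptions made for this example; none of them are taken from the article, and the tools it used may parameterize these kernels differently.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.5) -> float:
    """RBF kernel: exp(-gamma * ||x - y||^2). gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                      coef0: float = 1.0, degree: int = 2) -> float:
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * dot(x, y) + coef0) ** degree


def hyper_tangent_kernel(x: Vector, y: Vector, kappa: float = 0.1,
                         delta: float = 0.0) -> float:
    """Hyper-tangent (sigmoid) kernel: tanh(kappa * <x, y> + delta)."""
    return math.tanh(kappa * dot(x, y) + delta)


def gram_matrix(rows: List[Vector],
                kernel: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(a, b) for b in rows] for a in rows]


if __name__ == "__main__":
    # Hypothetical two-feature vectors standing in for scaled
    # physicochemical attributes; these are not data from the article.
    toy_proteins = [[0.2, 0.7], [0.3, 0.6], [0.9, 0.1]]
    for name, kernel in [("rbf", rbf_kernel),
                         ("polynomial", polynomial_kernel),
                         ("hyper-tangent", hyper_tangent_kernel)]:
        print(name, gram_matrix(toy_proteins, kernel))
```

An SVM trained with one of these kernels only ever sees the data through such pairwise values, which is why swapping the kernel can change the resulting accuracy without changing the feature set.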