

Authors: Banerjee Amit Kumar; Harikrishna Nayanoori; Kumar Jangam Vikram; Murty Upadhyayula Suryanarayana
Publisher: Taylor & Francis Ltd
ISSN: 1087-6545
Source: Applied Artificial Intelligence, Vol. 25, Iss. 5, 2011-05, pp. 426-439
Abstract
Several supervised and unsupervised methods are presently available for classifying and clustering highly nonlinear data sets. Biological data sets are known to be complex in nature because of their high dimensionality, complex attribute interactions, and dynamic behavior. In this article, we present the classification of 16 organisms based on the physicochemical properties of their proteins, using comparative intelligent techniques. Given the complexity of the working data set, we attempted to select the most important attributes using the feature selection facility available in TANAGRA (http://eric.univlyon2.fr/~ricco/tanagra/en/tanagra.html) to improve classification efficiency. Various methods available in LIB-SVM (a library for support vector machines), the Waikato Environment for Knowledge Analysis (WEKA), and the Konstanz Information Miner (KNIME) were used. Support vector machines (SVMs), radial basis function (RBF), polynomial, multilayer perceptron (MLP), hyper-tangent, and sequential minimal optimization (SMO) methods were adopted to achieve maximum accuracy. The best results obtained (>70%) are compared.
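
For readers unfamiliar with the kernel names listed in the abstract, the following is a minimal illustrative sketch of the RBF, polynomial, and hyperbolic tangent kernel functions in their commonly used forms. It is not the authors' code: the function names, parameter defaults, and feature vectors are hypothetical, and reading "hyper-tangent" as the hyperbolic tangent (sigmoid) kernel is an assumption.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * dot(x, y) + coef0) ** degree


def tanh_kernel(x, y, gamma=0.1, coef0=0.0):
    """Hyperbolic tangent (sigmoid) kernel: tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * dot(x, y) + coef0)


# Hypothetical, made-up feature vectors standing in for scaled
# physicochemical descriptors of two proteins (not data from the article).
protein_a = [0.2, 1.5, -0.7]
protein_b = [0.1, 1.0, -0.5]

for name, kernel in [
    ("RBF", rbf_kernel),
    ("polynomial", polynomial_kernel),
    ("hyperbolic tangent", tanh_kernel),
]:
    print(f"{name} kernel value: {kernel(protein_a, protein_b):.4f}")
```

Each function returns a similarity score between two feature vectors; an SVM uses such scores in place of a plain inner product, which is what allows it to separate classes that are not linearly separable in the original attribute space.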