Improvement in protein-coding region identification based on sliding window trigonometric fast transforms using Singular Value Decomposition

Author: Hota Malaya Kumar   Srivastava Vinay Kumar  

Publisher: Inderscience Publishers

ISSN: 1748-5673

Source: International Journal of Data Mining and Bioinformatics, Vol.5, Iss.1, 2011-02, pp. : 110-127

Disclaimer: Any content in publications that violate the sovereignty, the constitution or regulations of the PRC is not accepted or approved by CNPIEC.

Previous Menu Next

Abstract

In this paper, the performance of various sliding window trigonometric fast transforms for identification of protein coding regions has been analysed at the nucleotide level. It is found that, Short-Time Discrete Fourier Transform (ST-DFT) gives better identification accuracy in comparison with Short-Time Discrete Cosine Transform (ST-DCT), Short-Time Discrete Sine Transform (ST-DST) and Short-Time Discrete Hartley Transform (ST-DHT). In the proposed method, identification accuracy of protein coding regions has been improved by applying Singular Value Decomposition (SVD) on the DNA spectrum obtained using sliding window trigonometric fast transforms. The results show that, in proposed method all trigonometric fast transforms gives almost similar results in terms of area under ROC curve for GENSCAN test set.