Predictive QSAR Models for Polyspecific Drug Targets: The Importance of Feature Selection

E-ISSN: 1875-6697

ISSN: 1573-4099

Source: Current Computer-Aided Drug Design, Vol. 4, Iss. 2, 2008-06, pp. 91-110



Abstract

Since the advent of quantitative structure-activity relationship (QSAR) modeling, molecular structures have been represented quantitatively by encoding them as information-preserving descriptor values. Today a nearly unlimited variety of potential descriptors is available, and descriptor selection can no longer be done manually. There is thus a growing need for automated methods that reduce the dimensionality of the descriptor space. Classical feature selection (FS) and dimensionality reduction (DR) methods such as principal component analysis, which relies on the selection of those descriptors that contribute most to the variance of a data set, often fail to provide the best classification result. More sophisticated methods such as genetic algorithms, self-organizing maps, and stepwise linear discriminant analysis have proven useful for selecting descriptors with significant discriminative power. FS and DR become even more important when building predictive models intended to describe the QSAR of highly promiscuous target proteins. The ABC transporter family, the cardiac hERG potassium channel, and the hepatic cytochrome P450 family are classical representatives of such polyspecific proteins. In these cases the interaction pattern is rather complex, and selecting the most predictive descriptors therefore requires advanced methods. This review surveys FS and DR methods that have recently been applied successfully to classify ligands of polyspecific target proteins.
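
The abstract's point that variance-driven criteria can miss the descriptors that matter for classification can be illustrated with a small sketch. It is not taken from the review: the data are synthetic, the descriptor names `desc0` and `desc1` are hypothetical, a plain per-descriptor variance ranking stands in for the variance criterion (it is not a full principal component analysis), and a Fisher score is used as one simple example of a class-aware criterion in place of the methods the abstract names.

```python
import random
import statistics

random.seed(0)

# Synthetic toy data (an assumption for illustration, not from the review):
# desc0 has large variance but is unrelated to the class label;
# desc1 has small variance but separates the two classes.
n = 200
labels = [i % 2 for i in range(n)]
desc0 = [random.gauss(0.0, 10.0) for _ in labels]
desc1 = [random.gauss(1.0 if y else -1.0, 0.3) for y in labels]
descriptors = {"desc0": desc0, "desc1": desc1}


def fisher_score(values, labels):
    """Squared difference of class means divided by summed class variances."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    between = (statistics.mean(a) - statistics.mean(b)) ** 2
    within = statistics.variance(a) + statistics.variance(b)
    return between / within


# Unsupervised ranking: ignores the labels entirely.
variances = {k: statistics.variance(v) for k, v in descriptors.items()}
# Supervised ranking: rewards descriptors that separate the classes.
scores = {k: fisher_score(v, labels) for k, v in descriptors.items()}

top_by_variance = max(variances, key=variances.get)
top_by_fisher = max(scores, key=scores.get)
print("highest variance:", top_by_variance)
print("highest Fisher score:", top_by_fisher)
```

In this toy setting the variance ranking favors the noisy descriptor, while the class-aware score favors the descriptor that actually discriminates between the two groups. Real QSAR descriptor sets are far larger and correlated, so this only conveys the intuition.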