Article

A methodology for evaluating multi-objective evolutionary feature selection for classification in the context of virtual screening

Journal

SOFT COMPUTING
Volume 23, Issue 18, Pages 8775-8800

Publisher

SPRINGER
DOI: 10.1007/s00500-018-3479-0

Keywords

Feature selection; Multi-objective evolutionary algorithms; Classification; Decision trees; Virtual screening; Drug discovery

Funding

  1. European Regional Development Fund (ERDF)
  2. Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia [18946/JLI/13]

Virtual screening (VS) methods have been shown to increase success rates in many drug discovery campaigns when they complement experimental approaches, such as high-throughput screening methods or classical medicinal chemistry approaches. Nevertheless, the predictive capability of VS is not yet optimal, mainly due to limitations in the underlying physical principles describing drug binding phenomena. One way to improve VS methods is to support them with machine learning methods. When enough experimental data are available to train such methods, predictive capability can increase considerably. In this work we show how a multi-objective evolutionary search strategy for feature selection, which can provide small and accurate decision trees that chemists can easily understand, can drastically increase the applicability and predictive ability of these techniques and therefore aid considerably in the drug discovery problem. With the proposed methodology, we find classification models with accuracy between 0.9934 and 1.00 and area under ROC between 0.96 and 1.00 when evaluated on the full training sets, and accuracy between 0.9849 and 0.9940 and area under ROC between 0.89 and 0.93 when evaluated with tenfold cross-validation over 30 iterations, while substantially reducing the model size.
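
The trade-off between accuracy and model size that a multi-objective search explores can be illustrated with a minimal sketch of Pareto dominance over candidate feature subsets. This is not the authors' algorithm: the `Candidate` type, the `make` helper, and all accuracy values below are hypothetical and chosen only to show how non-dominated subsets are kept.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical feature subset with its two objective values."""
    features: frozenset  # indices of the selected features
    accuracy: float      # objective to maximize
    size: int            # number of selected features, objective to minimize


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.size <= b.size
    strictly_better = a.accuracy > b.accuracy or a.size < b.size
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def make(features, accuracy: float) -> Candidate:
    """Build a candidate whose size is the number of selected features."""
    fs = frozenset(features)
    return Candidate(fs, accuracy, len(fs))


# Toy values invented for illustration only (not results from the paper).
population = [
    make({0, 1, 2, 3}, 0.95),
    make({0, 2}, 0.93),
    make({1}, 0.80),
    make({0, 1, 3}, 0.92),  # dominated by {0, 2}: larger and less accurate
    make({2}, 0.78),        # dominated by {1}: same size, less accurate
]
front = pareto_front(population)
```

In a full evolutionary search, the accuracy of each subset would typically come from training and validating a classifier such as a decision tree on the selected features, and the front would be updated across generations rather than computed once.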
