Variable selection in logistic regression model with genetic algorithm Journal Article


Authors: Zhang, Z.; Trevino, V.; Hoseini, S. S.; Belciug, S.; Boopathi, A. M.; Zhang, P.; Gorunescu, F.; Subha, V.; Dai, S.
Article Title: Variable selection in logistic regression model with genetic algorithm
Abstract: Variable or feature selection is one of the most important steps in model specification. Especially in medical decision-making, the direct use of a medical database, without a previous analysis and preprocessing step, is often counterproductive. Variable selection is thus the process of choosing the most relevant attributes from the database in order to build robust learning models and thereby improve the performance of the models used in the decision process. In biomedical research, the purpose of variable selection is to select clinically important and statistically significant variables, while excluding unrelated or noise variables. A variety of methods exist for variable selection, but none of them is without limitations. For example, the stepwise approach, which is widely used, adds the best variable in each cycle, generally producing an acceptable set of variables. Nevertheless, it is limited by the fact that it is commonly trapped in local optima. The best subset approach can systematically search the entire covariate pattern space, but the solution pool can be extremely large with tens to hundreds of variables, as is common in today's clinical data. Genetic algorithms (GA) are heuristic optimization approaches and can be used for variable selection in multivariable regression models. This tutorial paper aims to provide a step-by-step approach to the use of GA in variable selection. The R code provided in the text can be extended and adapted to other data analysis needs. © Annals of Translational Medicine.
Keywords: medical research; data analysis; genetic algorithm; logistic regression; noise; variable selection; article; galgo; genetic algorithm (ga)
Journal Title: Annals of Translational Medicine
Volume: 6
Issue: 3
ISSN: 2305-5839
Publisher: AME Publishing Company
Date Published: 2018-02-01
Start Page: 45
Language: English
DOI: 10.21037/atm.2018.01.15
PROVIDER: scopus
PMCID: PMC5879502
PUBMED: 29610737
Notes: Article -- Export Date: 1 March 2018 -- Source: Scopus