FARMS: A New Algorithm for Variable Selection
Other authors
Publication date
2015
Abstract
Large datasets including an extensive number of covariates are generated these days in many different situations, for instance, in
detailed genetic studies of outbred human populations or in complex analyses of immune responses to different infections. Aiming
at informing clinical interventions or vaccine design, methods for variable selection identifying those variables with the optimal
prediction performance for a specific outcome are crucial. However, testing for all potential subsets of variables is not feasible and
alternatives to existing methods are needed. Here, we describe a new method to handle such complex datasets, referred to as FARMS,
that combines forward and all subsets regression for model selection. We apply FARMS to a host genetic and immunological dataset
of over 800 individuals from Lima (Peru) and Durban (South Africa) who were HIV infected and tested for antiviral immune
responses. This dataset includes more than 500 explanatory variables: around 400 variables with information on HIV immune
reactivity and around 100 individual genetic characteristics. We have implemented FARMS in the R statistical language and show
that FARMS is fast and outperforms other comparable commonly used approaches, thus providing a new tool for the thorough
analysis of complex datasets without the need for massive computational infrastructure.
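
To make the scale of the problem concrete: with 500 candidate variables there are 2^500 - 1 non-empty subsets (roughly 3.3 × 10^150), which is why exhaustive testing is not feasible. The abstract does not spell out how FARMS combines the two strategies, so the following is only a minimal sketch of the general idea, assuming a forward step that builds a shortlist and an all-subsets search restricted to that shortlist. It is written in Python rather than R for illustration, and the function names, the residual-sum-of-squares criterion, the fixed subset size, and the toy data are all assumptions, not the authors' implementation.

```python
import random
from itertools import combinations


def subset_count(p):
    """Number of non-empty subsets of p candidate variables."""
    return 2 ** p - 1


def rss(columns, y):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(r[i] * r[j] for r in design) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(design, y))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_select(X, y, max_vars):
    """Greedy forward step: repeatedly add the variable that lowers RSS the most."""
    chosen, remaining = [], list(X)
    while remaining and len(chosen) < max_vars:
        best = min(remaining,
                   key=lambda name: rss([X[v] for v in chosen + [name]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def best_subset(X, y, candidates, size):
    """All-subsets step, restricted to the shortlist and to a fixed subset size."""
    return min(combinations(candidates, size),
               key=lambda s: rss([X[v] for v in s], y))


def make_toy_data(n=40, p=6, seed=0):
    """Synthetic data (hypothetical): y depends only on x1 and x3."""
    rng = random.Random(seed)
    X = {f"x{j}": [rng.gauss(0, 1) for _ in range(n)] for j in range(1, p + 1)}
    y = [3 * u - 2 * v + 0.1 * rng.gauss(0, 1) for u, v in zip(X["x1"], X["x3"])]
    return X, y


X, y = make_toy_data()
shortlist = forward_select(X, y, max_vars=4)   # cheap greedy pass
best = best_subset(X, y, shortlist, size=2)    # exhaustive, but only on the shortlist
print(subset_count(20), shortlist, best)
```

In this sketch the exhaustive search runs over a handful of shortlisted variables instead of all candidates, which is what keeps the cost manageable. A real analysis would compare models of different sizes with a penalized criterion or cross-validation rather than raw RSS at a fixed size.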
Document type
Article
Language
English
Keywords
Pages
12 p.
Citation
Perez-Alvarez, S., Gómez, G., & Brander, C. (2015). FARMS: A new algorithm for variable selection. BioMed Research International, 2015
This item appears in the following collection(s)
- Articles [1531]
Rights
This document is subject to a Creative Commons license
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/3.0/es/