Genetic Algorithms (GAs) are a popular variable selection approach inspired by natural selection acting on the individuals of an evolving population. In the GA framework, each gene represents a variable and each chromosome (a sequence of genes, that is, of variables) represents a model. The population (by default 30 chromosomes, i.e. individuals) evolves through two processes: in the crossover step, pairs of chromosomes generate new individuals according to a crossover probability (default 50%), while in the mutation step the genes of each chromosome can change according to a mutation probability (default 1%). When a new chromosome performs better than the existing ones, it enters the population and the worst model is discarded.
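The evolution described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the toolbox's implementation: the function names, the uniform crossover operator, the binary encoding (1 = variable included), the rule that a child replaces the current worst chromosome when it scores higher, and the toy fitness function are all assumptions made for the example. In practice the fitness would be a model performance measure such as a cross-validated score.

```python
import random

def evolve(fitness, n_vars, pop_size=30, p_cross=0.5, p_mut=0.01,
           n_iter=200, seed=0):
    """Evolve binary chromosomes; gene i == 1 means variable i is in the model."""
    rng = random.Random(seed)
    # Random initial population, stored as (score, chromosome) pairs.
    pop = []
    for _ in range(pop_size):
        chrom = tuple(rng.randint(0, 1) for _ in range(n_vars))
        pop.append((fitness(chrom), chrom))
    for _ in range(n_iter):
        (_, a), (_, b) = rng.sample(pop, 2)
        # Crossover (uniform operator, an assumption): applied with probability p_cross.
        if rng.random() < p_cross:
            child = [rng.choice(pair) for pair in zip(a, b)]
        else:
            child = list(a)
        # Mutation: each gene flips independently with probability p_mut.
        child = tuple(g ^ 1 if rng.random() < p_mut else g for g in child)
        score = fitness(child)
        worst = min(range(pop_size), key=lambda i: pop[i][0])
        # Replacement (assumed rule): a new child that beats the worst model replaces it.
        if score > pop[worst][0] and all(child != c for _, c in pop):
            pop[worst] = (score, child)
    # Best chromosome first.
    return sorted(pop, reverse=True)

def toy_fitness(chrom):
    """Hypothetical fitness: variables 0 and 2 are informative, model size is penalized."""
    return 2 * chrom[0] + 2 * chrom[2] - 0.5 * sum(chrom)

population = evolve(toy_fitness, n_vars=6)
best_score, best_chrom = population[0]
print(best_score, best_chrom)
```

The defaults mirror the values quoted in the text (30 chromosomes, 50% crossover, 1% mutation); the number of iterations is an arbitrary choice for the sketch.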
After evolution, different strategies can be used to select the final subset of variables. In this toolbox, the GA evolution is repeated independently a fixed number of times (called runs) and the relative occurrence frequency of each variable in the best models is calculated. Then, the user can set a frequency threshold, and only the variables with occurrence frequencies higher than the threshold are further processed by a forward selection or All Subset models procedure. Finally, a list of models (including the best subsets of selected variables) is proposed, from which the user can choose the preferred model.
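The post-evolution step can be illustrated with a short Python sketch. It assumes, for simplicity, one best chromosome per run, encoded as a binary tuple (1 = variable selected); the function names and the example data are hypothetical, and the exhaustive enumeration at the end is only one plausible reading of how an All Subset models procedure builds its candidates.

```python
from itertools import combinations

def selection_frequencies(best_models):
    """Relative occurrence frequency of each variable across the best models."""
    n_models = len(best_models)
    n_vars = len(best_models[0])
    return [sum(m[j] for m in best_models) / n_models for j in range(n_vars)]

def retained_variables(freqs, threshold):
    """Indices of variables whose frequency is strictly higher than the threshold."""
    return [j for j, f in enumerate(freqs) if f > threshold]

def all_subsets(variables):
    """Every non-empty subset of the retained variables, smallest subsets first."""
    return [s for k in range(1, len(variables) + 1)
            for s in combinations(variables, k)]

# Hypothetical best chromosomes from four independent runs.
best_models = [
    (1, 0, 1, 0, 0),
    (1, 0, 1, 1, 0),
    (1, 1, 0, 0, 0),
    (1, 0, 1, 0, 0),
]
freqs = selection_frequencies(best_models)
kept = retained_variables(freqs, 0.5)
print(freqs)              # [1.0, 0.25, 0.75, 0.25, 0.0]
print(kept)               # [0, 2]
print(all_subsets(kept))  # [(0,), (2,), (0, 2)]
```

Each candidate subset would then be scored with the chosen performance measure to produce the ranked list of models offered to the user. The number of subsets grows as 2^k - 1 for k retained variables, which is why the frequency threshold matters before an exhaustive search.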