Chapter 7: Feature Selection Concerning Modeling Techniques
In this chapter, we will discuss three topics. The first is embedded methods
for feature selection, such as lasso and feature importance in tree-based
models like the random forest. The second is the interplay between linear
feature selection and non-linear models, and vice versa. The third is the set
of considerations to keep in mind when doing feature selection for certain
modeling and preprocessing techniques, such as linear regression, SVM, and PCA.
Embedded methods perform feature selection during the process of training the
model; the selection step is part of the modeling technique itself. These
methods consider dependencies among features, as well as the relationship
between each input feature and the output.
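
To make the idea of selection happening during training concrete, here is a minimal sketch using scikit-learn. The synthetic dataset, the penalty strength alpha=1.0, and the number of trees are assumptions chosen only for illustration; they are not taken from the chapter. Lasso zeroes out some coefficients while it fits, and a random forest produces importance scores as a by-product of fitting.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption for illustration): 10 features, 3 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)
# Standardize so the L1 penalty treats all features on the same scale.
X = StandardScaler().fit_transform(X)

# Lasso: the L1 penalty can shrink coefficients to exactly zero during fitting,
# so the features with non-zero coefficients are the selected ones.
lasso = Lasso(alpha=1.0).fit(X, y)
lasso_selected = np.flatnonzero(lasso.coef_)

# Random forest: impurity-based importances are available once training is done.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
forest_ranking = np.argsort(forest.feature_importances_)[::-1]

print("Features kept by lasso:", lasso_selected.tolist())
print("Top 3 features by forest importance:", forest_ranking[:3].tolist())
```

In both cases no separate selection pass is run: the list of kept features, or the ranking, falls out of the fitted model. How many features lasso keeps depends on alpha, which in practice would be tuned rather than fixed.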
We will perform cross-validation while using the techniques mentioned in this
chapter. We obtain the list of selected features for each cross-validation
fold, and then look at how many of the selected features are common across
all folds.
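
The procedure above can be sketched as follows. This is one possible implementation under stated assumptions: the synthetic data, the choice of lasso as the selector, alpha=1.0, and five folds are all illustrative and not prescribed by the chapter. The selector is fit on each training fold, the indices of the selected features are recorded, and the intersection across folds gives the features common to all of them.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption for illustration): 10 features, 3 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

selected_per_fold = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, _ in kfold.split(X):
    # Fit the scaler and the selector on the training part of the fold only.
    scaler = StandardScaler().fit(X[train_idx])
    model = Lasso(alpha=1.0).fit(scaler.transform(X[train_idx]), y[train_idx])
    # Features with non-zero coefficients are the ones selected in this fold.
    selected_per_fold.append(set(np.flatnonzero(model.coef_).tolist()))

# Features selected in every fold.
common_features = set.intersection(*selected_per_fold)

for i, selected in enumerate(selected_per_fold, start=1):
    print(f"Fold {i}: {sorted(selected)}")
print("Common to all folds:", sorted(common_features))
```

Features that appear in every fold are less likely to have been picked because of a quirk in one particular split of the data, which is the reason for comparing the lists at all.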