Chapter 7: Feature Selection Concerning Modeling Techniques

This chapter covers three topics. The first is embedded methods for feature selection, such as lasso regression and the feature importance of tree-based models such as random forest. The second is the interplay between linear feature selection and non-linear models, and vice versa. The third is the set of considerations to keep in mind when doing feature selection for particular modeling and preprocessing techniques, such as linear regression, SVM, and PCA.

Embedded methods perform feature selection during model training: the selection algorithm is part of the modeling technique itself. They consider dependencies among features as well as the relationship between the input features and the output.
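As a rough illustration of what "selection during training" can look like, the sketch below fits a lasso model and a random forest with scikit-learn and reads the selected features directly off the fitted models. The synthetic dataset, the `alpha` value, and the number of trees are assumptions chosen only for demonstration; they are not taken from this chapter.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 10 features, 3 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Lasso penalizes coefficient size, so features should be on a common scale.
X_scaled = StandardScaler().fit_transform(X)

# The L1 penalty drives some coefficients to exactly zero while the model
# is being fit; alpha=1.0 is an illustrative value, not a recommendation.
lasso = Lasso(alpha=1.0).fit(X_scaled, y)
lasso_selected = np.flatnonzero(lasso.coef_ != 0)
print("Features kept by lasso:", lasso_selected)

# A random forest exposes impurity-based importances after training.
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
importances = forest.feature_importances_
ranked = np.argsort(importances)[::-1]  # most important feature first
print("Features ranked by forest importance:", ranked)
```

In both cases no separate selection step is run: the nonzero lasso coefficients and the forest importances are by-products of fitting the model. In practice the penalty strength and any importance threshold would need tuning for the data at hand.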

We will use cross-validation with the techniques discussed in this chapter. For each cross-validation fold, we obtain the list of selected features, and then check how many of those features are common to all folds.
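The procedure described above could be sketched as follows. This is a minimal example that assumes scikit-learn, a synthetic dataset, five folds, and lasso as the selection technique; the feature names (`f0`, `f1`, ...) and the `alpha` value are hypothetical placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 12 features, 4 of them informative.
X, y = make_regression(n_samples=300, n_features=12, n_informative=4,
                       noise=5.0, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

per_fold = []  # one set of selected feature names per fold
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, _ in kf.split(X):
    # Fit the scaler and the model on the training part of the fold only.
    scaler = StandardScaler().fit(X[train_idx])
    model = Lasso(alpha=1.0).fit(scaler.transform(X[train_idx]), y[train_idx])
    kept = {feature_names[j] for j in np.flatnonzero(model.coef_)}
    per_fold.append(kept)

# Features that were selected in every fold.
common = set.intersection(*per_fold)
print("Selected per fold:", [sorted(s) for s in per_fold])
print("Common to all folds:", sorted(common))
```

Features that appear in every fold are less likely to have been selected because of a quirk in one particular training split, which is the motivation for looking at the intersection rather than at a single fit.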