7.5: Using Linear Model Feature Importance for Tree Models

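Taking the feature importance from a linear model (the coefficients of a linear or logistic regression, or the features retained by a lasso model) and using it to select features for a non-linear model such as random forest can be problematic. Linear models only capture linear relationships with the target, so a feature whose effect is non-linear receives a low importance score, and lasso may shrink its coefficient to zero. Important non-linear features can thereby get eliminated, and the resulting random forest model could be less than optimal.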
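The effect described above can be illustrated with a small synthetic sketch. It assumes scikit-learn and NumPy are available; the data, the variable names, and the choice of LassoCV and RandomForestRegressor are illustrative assumptions, not taken from this chapter. One feature has a linear effect, one has a purely quadratic effect, and one is noise.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(-3, 3, size=(n, 3))
# x0: linear effect, x1: symmetric quadratic effect, x2: pure noise
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0, 0.5, size=n)

# Linear view: absolute lasso coefficients as feature importance.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_coef = np.abs(lasso.coef_)

# Non-linear view: impurity-based random forest importance.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
forest_imp = forest.feature_importances_

for i in range(X.shape[1]):
    print(f"x{i}: |lasso coef| = {lasso_coef[i]:.3f}, "
          f"forest importance = {forest_imp[i]:.3f}")
```

In a setup like this, the lasso coefficient for x1 should come out close to zero because x1 has no linear correlation with the target, while the random forest should still give x1 a substantial share of importance. Selecting features by the lasso coefficients would therefore tend to drop a feature the random forest could use.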

Despite all the reasons for not mixing linear feature selection with non-linear models and vice versa, we can still try it, provided the empirical evidence supports doing so. For example, if a linear regression model trained on the features that random forest ranks as important scores better on the chosen metric than the benchmark linear regression model, the combination is worth keeping.
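One way to run the empirical check described above is to compare cross-validated scores. The following is a minimal sketch, assuming scikit-learn; the synthetic dataset, the median importance threshold, and the R² metric are all illustrative choices rather than recommendations from this chapter. The selection step sits inside a pipeline so that it is refit on each training fold.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data (an assumption for illustration only).
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# Benchmark: linear regression on all features.
benchmark = LinearRegression()

# Candidate: keep features whose random forest importance is at or above
# the median importance, then fit linear regression on them.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    threshold="median",
)
candidate = make_pipeline(selector, LinearRegression())

benchmark_r2 = cross_val_score(benchmark, X, y, cv=5, scoring="r2").mean()
candidate_r2 = cross_val_score(candidate, X, y, cv=5, scoring="r2").mean()

# Keep the mixed approach only if it beats the benchmark.
keep_candidate = candidate_r2 > benchmark_r2
print(f"benchmark R2 = {benchmark_r2:.4f}, candidate R2 = {candidate_r2:.4f}")
print("keep random forest selected features:", bool(keep_candidate))
```

Which model wins depends on the data, so the comparison should be rerun on the actual dataset and metric of interest before deciding.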