7.5: Using Linear Model Feature Importance for Tree Models
Taking the feature importance of a linear model, such as linear or logistic
regression, or the features selected by a lasso model, and using them in a
nonlinear model such as a random forest can be problematic. Linear models tend
to discard any feature whose relationship with the target isn't linear. As a
result, important nonlinear features can get eliminated, and the resulting
random forest model could be less than optimal.
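The effect can be seen on a small synthetic example. The sketch below is only an illustration under stated assumptions: the data, the feature layout (x0 quadratic, x1 linear, x2 noise), and the lasso penalty alpha=1.0 are all made up for this demonstration and do not come from a real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

# Hypothetical synthetic data: three features drawn uniformly from [-3, 3].
rng = np.random.default_rng(0)
n = 5000
X = rng.uniform(-3, 3, size=(n, 3))

# x0 drives the target through a purely quadratic (nonlinear) term,
# x1 through a linear term, and x2 not at all.
y = 2.0 * X[:, 0] ** 2 + 1.0 * X[:, 1] + rng.normal(0, 0.5, size=n)

# Lasso keeps only features with a nonzero coefficient.
# alpha=1.0 is an arbitrary penalty chosen for this illustration.
lasso = Lasso(alpha=1.0).fit(X, y)
lasso_selected = np.flatnonzero(np.abs(lasso.coef_) > 1e-8)

# Random forest importance computed on the same data.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
rf_importance = rf.feature_importances_

print("Lasso coefficients:", lasso.coef_)
print("Features kept by lasso:", lasso_selected)
print("Random forest importances:", rf_importance)
```

In this setup the quadratic feature x0 has almost no linear correlation with the target, so lasso drops it, while the random forest ranks it as the most important feature. A random forest trained only on the lasso-selected features would never see x0.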
Despite these reasons for not mixing linear feature selection with nonlinear
models and vice versa, we can still try it. If the empirical evidence suggests
that doing so yields good results, then it could be worth using. For example, if
a linear regression model trained on the features that a random forest ranks as
important scores better on the chosen metric than the benchmark linear
regression model trained on all features, we can keep that selection.
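One way to gather that empirical evidence is to compare the two models with cross-validation. The following is a minimal sketch, not a definitive recipe: the dataset comes from make_regression, and the choice of R² as the metric, the "median" importance threshold, and the five folds are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical data: 40 features, only 5 of which are informative.
X, y = make_regression(
    n_samples=300, n_features=40, n_informative=5, noise=10.0, random_state=0
)

# Benchmark: linear regression on all features.
benchmark = LinearRegression()

# Candidate: keep features whose random forest importance is at or above
# the median, then fit linear regression on them. Putting the selector in
# the pipeline refits it inside each fold, which avoids leaking test data.
candidate = make_pipeline(
    SelectFromModel(
        RandomForestRegressor(n_estimators=100, random_state=0),
        threshold="median",
    ),
    LinearRegression(),
)

benchmark_r2 = cross_val_score(benchmark, X, y, cv=5, scoring="r2").mean()
candidate_r2 = cross_val_score(candidate, X, y, cv=5, scoring="r2").mean()

# Keep the random forest selection only if it beats the benchmark.
keep_selection = bool(candidate_r2 > benchmark_r2)

print(f"Benchmark R^2: {benchmark_r2:.4f}")
print(f"Candidate R^2: {candidate_r2:.4f}")
print("Keep random forest feature selection:", keep_selection)
```

Whether the candidate wins depends on the data, so the comparison should be rerun on the actual dataset and metric of interest before adopting the selected features.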