Chapter 4: Higher Order Feature Engineering
Features in their raw form are not always useful for a machine learning model to
learn from. In some cases, we need to change the functional form of a feature,
such as taking the log of a skewed feature, to help the model learn. In other
cases, we need to transform a feature so that the model can process it at all:
for example, label encoding categorical features for a random forest model, or
dummy encoding them for linear models. In still other cases, we need to enrich
features with information from the target variable, through methods such as
mean encoding.

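The first two ideas above can be sketched as follows. This is a minimal illustration using pandas and NumPy on a hypothetical toy dataset (the column names `income` and `city` are invented for the example): a log transform to tame a skewed numeric feature, label encoding for tree models, and dummy (one-hot) encoding for linear models.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a right-skewed numeric feature and a categorical one.
df = pd.DataFrame({
    "income": [30_000, 45_000, 52_000, 250_000],
    "city": ["tokyo", "paris", "tokyo", "london"],
})

# Change of functional form: log1p compresses the long right tail of income.
df["log_income"] = np.log1p(df["income"])

# Label encoding: map each category to an integer code (fine for tree models,
# which split on thresholds and do not assume the codes are ordered amounts).
df["city_label"] = df["city"].astype("category").cat.codes

# Dummy (one-hot) encoding: one binary column per category, which avoids
# imposing a spurious numeric order on linear models.
dummies = pd.get_dummies(df["city"], prefix="city")
df = pd.concat([df, dummies], axis=1)
```

Note that label encoding imposes an arbitrary integer order on the categories, which is why it suits tree-based models but not linear ones.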
Certain feature engineering can be done on the entire dataset, whereas other
kinds must be fitted on cross-validated training data and only then applied to
the test data; this discipline is what prevents overfitting and data leakage.
In general, there are three types of features: categorical, ordinal, and
numeric. In this chapter, we will learn about higher order feature engineering
for each of these three feature types.
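To make the leakage point concrete, here is one common way mean encoding is computed without leaking the target: each row's encoding comes only from the other cross-validation folds, never from the row's own target. This is a sketch, not the only scheme; the toy data and the fallback to the global mean are assumptions of the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical toy data: a categorical feature and a binary target.
df = pd.DataFrame({
    "city": ["a", "a", "b", "b", "a", "b", "a", "b"],
    "target": [1, 0, 1, 1, 0, 1, 1, 0],
})

# Out-of-fold mean encoding: category means are computed on the training
# folds only, then applied to the held-out fold, so a row's own target
# value never appears in its encoded feature.
global_mean = df["target"].mean()
df["city_mean_enc"] = np.nan
kf = KFold(n_splits=4, shuffle=True, random_state=0)
for train_idx, val_idx in kf.split(df):
    fold_means = df.iloc[train_idx].groupby("city")["target"].mean()
    df.loc[df.index[val_idx], "city_mean_enc"] = (
        df.iloc[val_idx]["city"].map(fold_means)
    )

# Categories missing from a training fold fall back to the global mean.
df["city_mean_enc"] = df["city_mean_enc"].fillna(global_mean)
```

Computing the means on the full dataset instead would inject the test rows' targets into their own features, exactly the leakage the cross-validated procedure avoids.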