Chapter 4: Higher Order Feature Engineering

Features in their raw form are not always useful to a machine learning model. In some cases we need to change the functional form, such as taking the log of a skewed feature, to help the model learn. In other cases we must transform a feature so the model can process it at all: for example, label encoding categorical features for a random forest, or dummy (one-hot) encoding them for a linear model. In still other cases, we enrich features with information from the target variable through methods such as mean (target) encoding.
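As a minimal sketch of these four ideas, using a small hypothetical table with an `income` feature, a `city` category, and a `churned` target:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "income": [30_000, 45_000, 120_000, 60_000],
    "city": ["NY", "SF", "NY", "LA"],
    "churned": [0, 1, 0, 1],
})

# Log transform: compress a right-skewed numeric feature.
df["log_income"] = np.log1p(df["income"])

# Label encoding: map each category to an integer (fine for tree models).
df["city_label"] = pd.factorize(df["city"])[0]

# Dummy (one-hot) encoding: one binary column per category (for linear models).
dummies = pd.get_dummies(df["city"], prefix="city")

# Mean (target) encoding: replace each category with the mean of the target.
df["city_mean_enc"] = df.groupby("city")["churned"].transform("mean")
```

The column names and data here are illustrative; the same transformations are also available in scikit-learn as `LabelEncoder`, `OneHotEncoder`, and `TargetEncoder`.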

Some feature engineering can be performed on the entire dataset, while other transformations must be fitted on the cross-validated training data and only then applied to the test data; this separation avoids overfitting and data leakage. In general, features fall into three types: categorical, ordinal, and numeric. In this chapter we will learn higher order feature engineering for each of these three feature types.
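Mean encoding is a good example of a transformation that must be fitted only on training data. A minimal sketch of the fit-on-train, apply-to-test pattern, with hypothetical data and column names:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["NY", "SF", "NY", "LA", "SF", "NY"],
    "churned": [0, 1, 0, 1, 0, 1],
})

train = df.iloc[:4]
test = df.iloc[4:]

# Fit the encoding on the training split only: the test target is never seen.
means = train.groupby("city")["churned"].mean()
global_mean = train["churned"].mean()

# Apply to the test split; categories unseen in training fall back to the
# training-set global mean.
test = test.assign(city_enc=test["city"].map(means).fillna(global_mean))
```

In practice the same discipline extends to each cross-validation fold: the encoding is recomputed from the fold's training portion and applied to its validation portion.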