4.4: Conclusion

With so many types of encoding and transformation discussed, a big question arises: which one should we use for which feature? There are two ways to approach this. The first is to choose the transformation that fits the nature of the data, guided by domain knowledge.

The second method follows the principle of doing the least harm. We try every encoding and transformation that is applicable to a feature, based on its type. The only exceptions are cases where a transformation might cause harm, i.e., where it is not suitable for the feature type or for the modeling technique. For example, we cannot apply higher order feature engineering designed for numerical features to categorical features.

We should also consider whether a higher order feature engineering technique suits the modeling technique being used. For example, we should not feed label-encoded categorical features into linear models, because a linear model treats the arbitrary integer labels as ordered magnitudes.
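The point about label encoding and linear models can be illustrated with a minimal sketch. The data below is a hypothetical toy example (the names `cities` and `price` are assumptions, not taken from earlier chapters), and the fits are written out by hand rather than with a modeling library. For one-hot columns without an intercept, the least-squares coefficients are simply the per-category means of the target.

```python
from statistics import mean

# Hypothetical toy data: "city" is a nominal feature with no natural order.
cities = ["A", "B", "C", "A", "B", "C"]
price = [10.0, 30.0, 12.0, 11.0, 29.0, 13.0]


def sum_squared_error(actual, predicted):
    """Sum of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Label encoding: each category gets an arbitrary integer code.
label_codes = {"A": 0, "B": 1, "C": 2}
x = [label_codes[c] for c in cities]

# Least-squares line: price ~ intercept + slope * code.
# The model is forced to treat the codes as ordered magnitudes.
x_mean, y_mean = mean(x), mean(price)
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, price))
    / sum((xi - x_mean) ** 2 for xi in x)
)
intercept = y_mean - slope * x_mean
label_pred = [intercept + slope * xi for xi in x]

# One-hot encoding: one coefficient per category. Without an intercept,
# the least-squares coefficient of each indicator column is the mean of
# the target within that category.
category_mean = {
    c: mean(p for ci, p in zip(cities, price) if ci == c) for c in label_codes
}
onehot_pred = [category_mean[c] for c in cities]

sse_label = sum_squared_error(price, label_pred)
sse_onehot = sum_squared_error(price, onehot_pred)
print(f"SSE with label encoding:   {sse_label:.2f}")
print(f"SSE with one-hot encoding: {sse_onehot:.2f}")
```

In this toy data, category B has the highest target value but sits in the middle of the integer codes, so a single slope cannot capture it. The one-hot fit does not depend on the order of the categories.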

We can create multiple higher order features from an original feature. If the resulting set of features exceeds our computational capacity, we can select a subset of the higher order features. For this, we can use correlation analysis and hypothesis tests such as the F-test and the Chi-square test. In some cases, we might still end up with more than one type of higher order representation for an original feature. In such situations, we can keep them all, as they can help improve model performance.
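As a rough sketch of the selection step, the example below builds a few candidate higher order features from a single numerical feature and ranks them by a univariate F-statistic derived from the Pearson correlation with the target, F = r² / (1 − r²) × (n − 2). The data and the helper names (`pearson_r`, `f_statistic`, `select_top_k`) are hypothetical and written in plain Python for illustration; in practice a library routine would normally be used instead.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def f_statistic(r, n):
    """F-statistic of a one-feature linear regression, computed from r."""
    r2 = r * r
    if r2 >= 1.0:  # perfect linear relationship
        return math.inf
    return r2 / (1.0 - r2) * (n - 2)


def select_top_k(candidates, target, k):
    """Rank candidate features by F-statistic and keep the best k names."""
    n = len(target)
    scores = {
        name: f_statistic(pearson_r(values, target), n)
        for name, values in candidates.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical data: the target grows roughly with the square of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 3.9, 9.1, 15.8, 25.3, 35.9, 49.2, 63.8]

# Candidate higher order representations of the same original feature.
candidates = {
    "x": x,
    "x_squared": [v ** 2 for v in x],
    "log_x": [math.log(v) for v in x],
    "sqrt_x": [math.sqrt(v) for v in x],
}

selected, scores = select_top_k(candidates, y, k=2)
print("Selected features:", selected)
```

The value of `k` here stands in for whatever the computational budget allows. For categorical features with a categorical target, the Chi-square test would take the place of the F-statistic in the scoring step.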