Chapter 5: Interaction Effect Feature Engineering

The relationship between the independent variable and the dependent variable is called the main effect. For example, the impact of calorie intake on cholesterol levels could be referred to as the main effect. Some features might interact with each other and can affect each other's values. For example, body weight can impact calorie intake. People with higher body weights tend to consume more calories. Body weight and calorie intake, both can impact cholesterol levels in the body. Interaction effects indicate that a third variable influences the relationship between the dependent variable and the independent variable. The effect of one independent variable on the dependent variable is dependent on the value of another independent variable.

Interaction effects can explain additional variation in the dependent variable, more than what individual features could do alone. Interaction effect exists amongst some independent variables. It is important to identify these interaction effects and represent them as new features. In the next step, we need to test the validity of the interaction effect. If we keep adding interaction effects mechanically amongst all existing features in a dataset, without any backing of domain knowledge or established principles, it will lead to overfitting. In this chapter, we will discuss methods other than domain knowledge for identifying interaction effects.