8.6: Putting Everything Together
For certain models, such as Xgboost and linear models, we used a GPU to train the model. For Lightgbm, we used a 64GB RAM machine for faster computation.
These algorithms provide the best combinations of features we have found so far, but these are not necessarily the best and ideal combinations possible. As a result, there may still be room for improvement. If we take the feature set generated by a metaheuristic algorithm and perform feature selection on it again, there is a chance of achieving even better performance. We will see this in the hotel total room booking prediction model, where we run the genetic algorithm multiple times. Each new iteration of the genetic algorithm uses the set of features output by the previous iteration as its input, and in each iteration we try to obtain better performance than in the previous one.
8.6.1 Hotel Total Room Booking
We used all 4 metaheuristic algorithms with the 3 models. Among them, the combination of Lightgbm and the genetic algorithm performed the best. Even so, there was still a chance of improving the feature combination. We used the list of features obtained from the genetic algorithm as the input features and ran the genetic algorithm a few more times to refine the feature list further. We performed this exercise a total of 5 times and saw improvement in the 2nd, 3rd, and 4th iterations. Beyond the 4th iteration, there was no further improvement.
For the genetic algorithm, we used a population of 75 and executed 25 generations in the first iteration. For subsequent iterations, the number of generations was reduced to 20. The output of each iteration of the genetic algorithm was used as the input for the next iteration of feature selection. The execution time of each iteration of the genetic algorithm was limited to 1200 minutes.
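The sketch below shows one way this iterative setup could be wired together. It assumes a pandas DataFrame X, a target series y, and Lightgbm as the fitness model; the hand-rolled genetic algorithm, the 3-fold scoring inside the fitness function, and the crossover and mutation settings are illustrative assumptions rather than the exact implementation used here.

import random, time
import numpy as np
from lightgbm import LGBMRegressor
from sklearn.model_selection import cross_val_score

def ga_select(X, y, features, pop_size=75, generations=25, time_budget_s=1200 * 60):
    """Return the best feature subset found within the generation and time budget."""
    start = time.time()

    def fitness(mask):
        cols = [f for f, keep in zip(features, mask) if keep]
        if not cols:
            return -np.inf
        # Negative RMSE, so a higher fitness value is better.
        return cross_val_score(LGBMRegressor(), X[cols], y,
                               scoring="neg_root_mean_squared_error", cv=3).mean()

    population = [[random.random() < 0.5 for _ in features] for _ in range(pop_size)]
    scores = [fitness(m) for m in population]

    for _ in range(generations):
        if time.time() - start > time_budget_s:
            break  # respect the per-iteration execution-time limit
        new_population = []
        while len(new_population) < pop_size:
            # Tournament selection of two parents, one-point crossover, bit-flip mutation.
            p1, p2 = (max(random.sample(list(zip(population, scores)), 3),
                          key=lambda pair: pair[1])[0] for _ in range(2))
            cut = random.randrange(1, len(features))
            child = p1[:cut] + p2[cut:]
            child = [(not bit) if random.random() < 0.02 else bit for bit in child]
            new_population.append(child)
        population = new_population
        scores = [fitness(m) for m in population]

    best = population[int(np.argmax(scores))]
    return [f for f, keep in zip(features, best) if keep]

# Iterative refinement: the selected subset feeds the next run, as described above.
# selected = list(X.columns)
# for i in range(5):
#     selected = ga_select(X, y, selected, generations=25 if i == 0 else 20)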
In the 1st iteration, the genetic algorithm selected 44 features. In the 2nd, 3rd, and 4th iterations, the number of features was reduced from 44 to 20, 10, and 9, respectively. At the 4th iteration, RMSE was also reduced to 8.63 for the test and validation datasets, and RMSE for the external test data decreased to 8.79. Results for each cross-validation are presented in figure 8.6.1.
Figure 8.6.1 performance of the Lightgbm tree model with genetic algorithm feature selection on cross-validation test, validation, and external test data for hotel total room booking prediction
This is the best result found so far for the hotel total room booking prediction dataset. The RMSE on both test datasets is very similar, and there is little to no difference in results across the different cross-validations. Both of these factors suggest that the model will generalize well to unseen data.
8.6.2 Hotel Booking Cancellation
The combination of Xgboost and simulated annealing performed the best for the hotel booking cancellation dataset. It achieved a precision of 0.88 on the cross-validation test and validation datasets. On the external test dataset, the precision was 0.93. The downside of this solution is a low recall of 0.41 on the external test data. For simulated annealing, we used 35 iterations and 75 perturbations for feature selection. Figure 8.6.2.1 shows the model performance for each cross-validation.
Figure 8.6.2.1 performance of the Xgboost tree model with simulated annealing feature selection on cross-validation test, validation, and external test data for hotel booking cancellation
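A minimal sketch of simulated-annealing feature selection paired with Xgboost is shown below, assuming a binary classification setup scored on precision. Mapping the reported 35 iterations and 75 perturbations onto an outer temperature loop with 75 single-feature flips per temperature step is an assumption, as are the cooling schedule and the 3-fold scoring; X, y, and the feature list are placeholders.

import math, random
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

def sa_select(X, y, features, n_iterations=35, n_perturbs=75, t_start=1.0, cooling=0.9):
    def score(mask):
        cols = [f for f, keep in zip(features, mask) if keep]
        if not cols:
            return 0.0
        # Mean precision over a small cross-validation is the objective to maximize.
        return cross_val_score(XGBClassifier(eval_metric="logloss"),
                               X[cols], y, scoring="precision", cv=3).mean()

    current = [random.random() < 0.5 for _ in features]
    current_score = score(current)
    best, best_score = current, current_score
    temperature = t_start

    for _ in range(n_iterations):
        for _ in range(n_perturbs):
            candidate = current[:]
            flip = random.randrange(len(features))  # perturbation: toggle one feature
            candidate[flip] = not candidate[flip]
            candidate_score = score(candidate)
            delta = candidate_score - current_score
            # Always accept improvements; accept worse moves with a temperature-scaled probability.
            if delta > 0 or random.random() < math.exp(delta / temperature):
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = current, current_score
        temperature *= cooling  # cool down after each outer iteration

    return [f for f, keep in zip(features, best) if keep]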
The second-best solution is the combination of Xgboost and the genetic algorithm. It achieved a precision of 0.86 on the cross-validation test and validation datasets, and a precision of 0.92 on the external test dataset. This solution has a relatively better recall of 0.41 on the external test data. We used 20 generations with 75 chromosomes for the algorithm execution. Figure 8.6.2.2 shows the model performance for each cross-validation.
Both solutions, especially the first one, are not perfect, but they are nearly useful. If the precision could reliably be said to be above 0.9 and close to 0.95, the hotel could use the model's predictions to oversell rooms, even though the model has a lower recall. To push the precision higher, we can try an ensemble of two or more models and observe its impact on precision and recall. Imagine a model that predicts hotel booking cancellations with a high degree of reliability: even if it identifies only 40 percent of cancellations and misses the remaining 60 percent, it is still useful to the hotel, because it minimizes the losses arising from those 40 percent of cancellations.
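One possible ensembling setup is sketched below: a soft-voting combination of Xgboost and Lightgbm, plus an "agreement" rule that flags a cancellation only when both models agree, which typically trades recall for precision. The model hyperparameters and the 0.5 probability threshold are illustrative assumptions.

from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import precision_score, recall_score

def evaluate_ensembles(X_train, y_train, X_test, y_test):
    xgb = XGBClassifier(eval_metric="logloss")
    lgbm = LGBMClassifier()

    # Soft voting averages the predicted probabilities of the two models.
    voting = VotingClassifier(estimators=[("xgb", xgb), ("lgbm", lgbm)], voting="soft")
    voting.fit(X_train, y_train)
    voting_pred = voting.predict(X_test)

    # Agreement rule: predict a cancellation only when both models predict it,
    # which raises precision at the cost of recall.
    xgb.fit(X_train, y_train)
    lgbm.fit(X_train, y_train)
    agree_pred = ((xgb.predict_proba(X_test)[:, 1] > 0.5) &
                  (lgbm.predict_proba(X_test)[:, 1] > 0.5)).astype(int)

    for name, pred in [("soft voting", voting_pred), ("agreement", agree_pred)]:
        print(name,
              "precision:", round(precision_score(y_test, pred), 2),
              "recall:", round(recall_score(y_test, pred), 2))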
Comparing the solutions obtained through all 4 metaheuristic algorithms, these are the best results achieved for this dataset relative to the other feature selection methods.
Figure 8.6.2.2 performance of the Xgboost tree model with genetic algorithm feature selection on cross-validation test, validation, and external test data for hotel booking cancellation
8.6.3 Car Sales
We tried different combinations of metaheuristic algorithms and models. The best performance was achieved by the combination of Lightgbm and simulated annealing. For the cross-validation test and validation data, the RMSE was 197495, whereas for the external test data it was 224034. This is better than the results reported in chapter 7. For simulated annealing, we used 35 iterations with 75 perturbations.
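The sketch below shows one way the RMSE comparison between cross-validation folds and the external test set could be computed for this dataset, assuming a Lightgbm regressor on the feature subset returned by simulated annealing. The 5-fold split, the fixed random seed, and the function name rmse_report are illustrative assumptions.

from lightgbm import LGBMRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error

def rmse_report(X, y, X_external, y_external, selected_features, n_splits=5):
    X, X_external = X[selected_features], X_external[selected_features]

    fold_rmse = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=42).split(X):
        model = LGBMRegressor().fit(X.iloc[train_idx], y.iloc[train_idx])
        pred = model.predict(X.iloc[test_idx])
        fold_rmse.append(mean_squared_error(y.iloc[test_idx], pred) ** 0.5)

    # Refit on all available training data before scoring the external hold-out set.
    final_model = LGBMRegressor().fit(X, y)
    external_rmse = mean_squared_error(y_external, final_model.predict(X_external)) ** 0.5

    print("per-fold RMSE:", [round(r) for r in fold_rmse])
    print("external RMSE:", round(external_rmse))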
Figure 8.6.3 shows the model performance across different cross-validations. The RMSE is not very consistent across the cross-validations, and it also differs between the test datasets. The bigger issue with this dataset, however, is that the RMSE is still very high. None of the feature engineering and feature selection performed on this dataset produced a workable model that can predict the price of used cars reliably. This indicates that the dataset requires data cleaning and domain-specific feature engineering. Domain knowledge in particular can help in organizing the data into a form that is easier for the model to learn from, in finding anomalies and treating them so that the dataset contains as little noise as possible, and finally in creating features that derive from that domain knowledge.
Figure 8.6.3 performance of the
Lightgbm model with simulated annealing feature selection on cross-validation
test, validation, and external test data for used car price prediction.
8.6.4 Coupon Recommendation
For this dataset, Xgboost with simulated annealing feature selection performed the best. It achieved a precision of 0.78 on the cross-validation test and validation datasets, and a precision of 0.79 on the external test dataset. The downside of this solution is a low recall of 0.59 on the external test data. For simulated annealing, we used 35 iterations and 75 perturbations for feature selection. Figure 8.6.4 shows the model performance for each cross-validation.
This is the best performance that could be achieved for this dataset across all feature selection methods and modeling techniques. However, these results are not good enough for the model to be accepted as reliable. If either the precision or the recall had been close to 0.9 or higher, we would have had a reliable model. Hence, although the model's performance across the different cross-validations is very similar and there is very little difference between the two test datasets, we will discard this model.
In the absence of domain knowledge for this dataset, we halt any further effort to improve the model at this point.
Figure 8.6.4 performance of the Xgboost model with simulated
annealing feature selection on cross-validation test, validation, and external
test data for coupon recommendation prediction.