Chapter 9: Explaining Models and Model Predictions to Laymen
9.1: Introduction
Interpretability of a machine learning model is the degree to which a layman can understand the cause of a specific prediction.
Whenever further action that costs time or money will be taken on prediction results, it helps to understand the root cause behind the algorithm's prediction. Unless the model is built for a toy dataset, or its output will never drive any insight or action, it is useful to have a justification for its predictions: what exact pattern did the model capture to produce this specific prediction?
Even if prediction results come from a sophisticated, state-of-the-art algorithm, they still warrant justification. Explainability also helps identify incorrect predictions more clearly, which is useful during periodic model retraining. Models are usually retrained when a covariate shift occurs, or when the model predicts the wrong labels for a class that was under-represented in the training data; once enough labeled data for that class is available, retraining can be considered. Explainable models make false positives easier to spot, because such cases will be hard to justify. This helps in building robust, high-quality retraining datasets that further improve the model.
Some models are inherently explainable, while others can be explained with the help of explanation techniques.
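As a minimal sketch of inherent explainability (the toy data and feature names here are invented for illustration): a linear model is explainable by construction, because each prediction decomposes exactly into a sum of per-feature contributions (coefficient times feature value), which can be shown directly to a non-technical stakeholder.

```python
import numpy as np

# Toy data: two hypothetical features ("age", "income") and one target.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([5.0, 4.0, 11.0, 10.0])

# Fit a linear model y ≈ X @ w via least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Explain a single prediction as per-feature contributions:
# each term w[i] * x_new[i] is that feature's share of the prediction.
x_new = np.array([2.0, 3.0])
contributions = w * x_new
prediction = contributions.sum()

print(dict(zip(["age", "income"], contributions.round(2))))
print("prediction:", round(prediction, 2))
```

For models without this additive structure (deep networks, gradient-boosted ensembles), post-hoc explanation techniques play an analogous role by approximating such per-feature attributions.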