Why is understanding a model important?
“If a model cannot be communicated clearly, except from computer to computer, its contribution will be minimal” - Wagner & Rondinelli (2016)
Interpretability has become one of the most important topics in machine learning, and it is something every data scientist needs to be familiar with. For hundreds of years we have had simple, interpretable models such as linear regression and rule-based systems, but in recent years there has been a huge rise in larger, more complex nonlinear models, and their predictions are not always easy to explain. As we start to use these more powerful nonlinear models to make decisions on real-world matters, our attention must inevitably turn to interpretability and explainability.
What is Interpretable Machine Learning?
Interpretable machine learning is a branch of machine learning that focuses on the development and use of models that are understandable by humans. It seeks to bridge the gap between human and machine understanding by creating models that can explain their decisions through natural language, visualizations, and other methods that are easy to follow. By doing so, it allows humans to understand the inner workings of the model and to assess its accuracy and reliability. In short, it refers to methods and models that make the behavior and predictions of machine learning systems understandable to humans.
After exploring the concepts of interpretability, let's look at interpretable models such as decision trees, decision rules, and linear regression; model-agnostic methods for interpreting black-box models, such as feature importance and accumulated local effects; and methods for explaining individual predictions, such as Shapley values and LIME.
Why Interpretable Machine Learning?
Interpretability is a genuine concern for stakeholders across domains. No longer an esoteric worry or a "nice to have" for practitioners, the importance of interpretable machine learning and AI has become clear to more and more people over the past few years, and for a wide array of reasons. Some of the benefits of interpretable machine learning are:
- Fairness: Ensuring that predictions are unbiased and do not explicitly or implicitly discriminate against protected groups.
- Privacy: Ensuring that sensitive information in the data is protected.
- Reliability or Robustness: Ensuring that small changes in the input do not lead to large changes in the prediction.
- Causality: Checking that only causal relationships are picked up.
- Trust: It is easier for humans to trust a system that explains its decisions compared to a black box.
- Legal: Regulations such as the GDPR emphasize a right to explanation.
What makes an Interpretable Model?
When humans easily understand the decisions a machine learning model makes, we have an “interpretable model”. In short, we want to know what caused a specific decision. If we can tell how a model came to a decision, then that model is interpretable.
For example,
- We can train a random forest machine learning model to predict whether a specific passenger survived the sinking of the Titanic in 1912. The model uses the passenger's attributes, such as ticket class, gender, and age, to predict whether they survived. Now let's say our random forest model predicts a 93% chance of survival for a particular passenger. Random forest models can easily consist of hundreds or thousands of “trees”, which makes it nearly impossible to grasp their reasoning as a whole. But we can make each individual prediction interpretable using an approach borrowed from game theory; a sketch of this appears below.
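As a minimal sketch of that game-theoretic approach, the Python snippet below trains a random forest on a Titanic-style table and attributes one passenger's predicted survival probability to the individual features using the shap package. The file name and column names (pclass, sex, age, fare, survived) are assumptions, not a fixed dataset format.

```python
# Minimal sketch (assumes scikit-learn and the shap package; the CSV and its
# column names are hypothetical stand-ins for the Titanic passenger data).
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

titanic = pd.read_csv("titanic.csv")                       # hypothetical file
X = titanic[["pclass", "sex", "age", "fare"]].copy()
X["sex"] = (X["sex"] == "female").astype(int)              # encode the category numerically
X["age"] = X["age"].fillna(X["age"].median())              # random forests need complete values
y = titanic["survived"]

model = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

passenger = X.iloc[[0]]                                    # one passenger of interest
print("Predicted survival probability:", model.predict_proba(passenger)[0, 1])

# Shapley values split the prediction's deviation from the average among the features.
sv = shap.TreeExplainer(model).shap_values(passenger)
# Older shap versions return a list per class; newer ones a (samples, features, classes) array.
sv_survived = sv[1] if isinstance(sv, list) else sv[..., 1]
print(dict(zip(X.columns, sv_survived[0].round(3))))
```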
What is a Model Prediction?
Model prediction in interpretable machine learning is the process of applying a trained model to a dataset in order to make predictions. The model is trained on existing data and then used to predict outcomes for new data points. The predictions are based on patterns discovered in the training data and can be used to inform decisions about future events or trends. The goal of interpretable machine learning is to make these predictions understandable and to provide insight into why certain decisions were made.
There is a whole plethora of techniques out there to explain why a model made a certain prediction. Some models, like low-dimensional linear regression, are intrinsically interpretable: you can just look at the model coefficients, and they tell you exactly how the model works under the hood. Then there is a whole suite of methods that work with any ML model, such as training a local surrogate or a global surrogate (see the sketch below). Shapley values are an interesting technique that lets you distribute credit for a prediction among the input features in a theoretically principled way. Finally, there are domain-specific methods.
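As an illustration of the global-surrogate idea, the sketch below trains a shallow decision tree to mimic the predictions of a stand-in "black box"; the synthetic dataset and the gradient-boosting model are placeholders chosen only for the example.

```python
# Minimal global-surrogate sketch (assumes scikit-learn; the dataset and the
# gradient-boosting "black box" are placeholders chosen only for illustration).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Train an interpretable model on the black box's predictions, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the shallow tree reproduces the black box's output.
print("Fidelity:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate))   # readable rules that approximate the black box
```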
For example, to explain image models you can try to highlight the most relevant parts of an input image by producing saliency maps (sketched below). You can also look at example-based explanations, where you try to find the smallest change in the input data that would cause the output prediction to change. With such an interpretability toolkit we can start to dispel the myth that machine learning models are all just black boxes that cannot be understood or trusted.
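A minimal sketch of one common saliency-map variant, vanilla gradient saliency, follows; it assumes PyTorch and torchvision, and the random tensor stands in for a real preprocessed image.

```python
# Minimal gradient-saliency sketch (assumes PyTorch and torchvision; the random
# tensor is a placeholder for a real preprocessed 1x3x224x224 image).
import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

img = torch.rand(1, 3, 224, 224, requires_grad=True)   # placeholder input image

score = model(img).max(dim=1).values   # score of the top predicted class
score.backward()                       # gradients of that score w.r.t. the pixels

# Per-pixel importance: take the largest absolute gradient across the colour channels.
saliency = img.grad.abs().max(dim=1).values
print(saliency.shape)                  # torch.Size([1, 224, 224])
```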
Techniques to interpret the most popular machine learning models:
- LIME (Local Interpretable Model-Agnostic Explanations): This approach approximates a complex model in the neighborhood of the prediction of interest with a simple interpretable model, such as a linear model or a decision tree. In outline, LIME generates perturbed samples around the query point, obtains the black-box model's predictions for them while weighting the samples by their proximity to the query point, and then fits a simple interpretable model to this weighted dataset.
By fitting a lime object in MATLAB, you can obtain LIME explanations via a simple interpretable model; a Python sketch of the same workflow follows.
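For readers working in Python rather than MATLAB, the lime package follows the same recipe; the sketch below uses a stand-in scikit-learn dataset and model in place of your own.

```python
# Minimal LIME sketch in Python (assumes the lime and scikit-learn packages;
# the breast-cancer dataset and random forest are stand-ins for your own model).
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Fit a weighted linear model in the neighbourhood of one query point.
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
print(exp.as_list())   # (feature condition, weight) pairs of the local surrogate
```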
- Partial Dependence (PDP) and Individual Conditional Expectation (ICE) Plots:
With these methods, you examine the effect of one or two predictors on the prediction by varying them over their range and averaging the model output over the remaining features (PDP), or by plotting one curve per observation instead of the average (ICE).
In MATLAB, such a partial dependence plot can be generated with the function plotPartialDependence; a Python equivalent is sketched below.
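In Python, scikit-learn's PartialDependenceDisplay produces comparable plots; the snippet below is a minimal sketch on a stand-in regression dataset and model.

```python
# Minimal PDP/ICE sketch (assumes scikit-learn and matplotlib; the diabetes
# dataset and gradient-boosting model are placeholders).
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind="both" overlays the per-sample ICE curves on the averaged PDP curve.
PartialDependenceDisplay.from_estimator(model, X, features=["bmi", "s5"], kind="both")
plt.show()
```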
- Shapley Values: This technique explains how much each predictor contributes to a prediction by distributing the deviation of the prediction of interest from the average prediction among the features. It is particularly popular in the finance industry because it has game theory as its theoretical underpinning and because it satisfies the regulatory requirement of providing complete explanations: the Shapley values of all features sum to the total deviation of the prediction from the average. The MATLAB function shapley computes Shapley values for a query point of interest.
In such a plot, the Shapley values indicate how much each predictor pushes the prediction away from the average prediction at the point of interest, with the average indicated by a vertical line at zero; the additivity property can be checked directly in code, as sketched below.
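That additivity property is easy to verify with the Python shap package; the regression dataset and model below are placeholders chosen only for the check.

```python
# Minimal check of the additivity property (assumes scikit-learn and shap;
# the regression dataset and model are placeholders).
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

explanation = shap.TreeExplainer(model)(X.iloc[[0]])   # Shapley values for one query point

prediction = model.predict(X.iloc[[0]])[0]
average = explanation.base_values[0]                   # the model's average prediction (base value)

# The Shapley values sum to the deviation of the prediction from the average.
print(prediction - average, "vs", explanation.values[0].sum())
```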
How to choose an Interpretability Method?
Different interpretability methods have their own limitations. A best practice is to be aware of those limitations as you fit these algorithms to the various use cases. Interpretability tools help you understand why a machine learning model makes the predictions that it does. These approaches are likely to become increasingly relevant as regulatory and professional bodies continue to work towards a framework for certifying AI for sensitive applications, such as autonomous transportation and medicine.
Conclusion:
Machine learning can be interpretable, and this means we can build models that humans understand and trust. Carefully constructed machine learning models can be verifiable and understandable. That's why we can use them in highly regulated areas like medicine and finance.
Looking for GPUs to train your ML models? Request a free trial with E2E Cloud: https://zfrmz.com/LK5ufirMPLiJBmVlSRml