
Understanding SHAP: How Shapley Values Explain Black‑Box Models

This article explains SHAP (SHapley Additive exPlanations), its theoretical foundations in game theory, how Shapley Values are computed, algorithmic approximations such as TreeSHAP and DeepSHAP, practical code examples, and the strengths and limitations of using SHAP for model interpretability.


SHAP Post‑hoc Explanation Method

SHAP (SHapley Additive exPlanations) is a post‑hoc explanation method that borrows ideas from game theory. It measures each feature’s marginal contribution (Shapley Value) to a model’s prediction, enabling interpretation of black‑box models. Because exact Shapley Value computation is costly, several approximations have been proposed, leading to variants such as TreeSHAP for tree models, DeepSHAP for neural networks, and the model‑agnostic Kernel SHAP.

Basic Idea of SHAP

In cooperative game theory, the marginal contribution of a player equals the difference between the output when the player participates and when they do not. Translating this to machine learning, the “players” become features, and the “game” is the model prediction. For each feature we compute the expected marginal contribution over all possible feature subsets, which yields the Shapley Value.

SHAP is additive: the model output for a sample can be expressed as the sum of a base value (average prediction) and the Shapley Values of all features.
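In symbols, the additive property reads as follows (standard SHAP notation: f(x) is the prediction for sample x, M the number of features, φ_i the Shapley Value of feature i, and φ_0 the base value):

```latex
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i,
\qquad \phi_0 = \mathbb{E}\left[f(X)\right]
```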

The explanation model must satisfy three properties: local accuracy (the sum equals the model prediction), missingness (features absent from a sample receive zero contribution), and consistency (changing a model so a feature’s contribution increases must not decrease its Shapley Value).

Shapley Value

From local accuracy, the model prediction for a sample equals the base value plus the sum of feature Shapley Values. The Shapley Value for a feature is the expected difference in model output when the feature is present versus absent, averaged over all subsets. The probability of each subset is derived combinatorially.

Fix a feature i. In a random ordering of all M features, the size |S| of the subset preceding i is uniformly distributed over 0, 1, …, M−1; probability = 1/M.

Given that size, each of the C(M−1, |S|) subsets of size |S| drawn from the M−1 remaining features is equally likely; probability = 1/C(M−1, |S|).

Multiplying the two probabilities gives the probability of a particular subset S: 1/(M · C(M−1, |S|)) = |S|!(M−|S|−1)!/M!.
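Putting the pieces together yields the standard Shapley Value formula (F is the full feature set, and f_S denotes the model's expected output when only the features in S are known):

```latex
\phi_i = \sum_{S \subseteq F \setminus \{i\}}
\frac{|S|!\,(M - |S| - 1)!}{M!}
\left[ f_{S \cup \{i\}}\bigl(x_{S \cup \{i\}}\bigr) - f_S\bigl(x_S\bigr) \right]
```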

Example Calculation

An example with a black‑box model having three features illustrates the computation of Shapley Values. The table of subset predictions and the resulting Shapley Values are shown.
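The calculation can be reproduced with a short brute‑force script. The subset predictions below are illustrative placeholders (the article's original table is not reproduced here); the function itself implements the exact Shapley formula for any table of subset values.

```python
from itertools import combinations
from math import factorial

# Hypothetical model outputs for each feature subset of a 3-feature model.
# These numbers are illustrative only, not the article's original table.
v = {
    frozenset(): 50.0,            # empty subset: base value (average prediction)
    frozenset({0}): 60.0,
    frozenset({1}): 55.0,
    frozenset({2}): 45.0,
    frozenset({0, 1}): 70.0,
    frozenset({0, 2}): 58.0,
    frozenset({1, 2}): 52.0,
    frozenset({0, 1, 2}): 65.0,   # full subset: the actual prediction
}

def exact_shapley(v, features):
    """Exact Shapley Values: weighted marginal contributions over all subsets."""
    M = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for s in range(M):
            for S in combinations(others, s):
                # combinatorial weight |S|! (M-|S|-1)! / M!
                w = factorial(s) * factorial(M - s - 1) / factorial(M)
                total += w * (v[frozenset(S) | {i}] - v[frozenset(S)])
        phi[i] = total
    return phi

phi = exact_shapley(v, [0, 1, 2])
# Local accuracy: base value + sum of Shapley Values equals the full prediction
assert abs(v[frozenset()] + sum(phi.values()) - v[frozenset({0, 1, 2})]) < 1e-9
print(phi)
```

Note that local accuracy holds for any table of subset values, so the same check works once the real predictions are substituted.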

SHAP Implementation Algorithms

Computing exact Shapley Values requires evaluating the model on every subset of features, which is exponential in the number of features. Approximation algorithms reduce this cost. The model‑agnostic Kernel SHAP works for any model, while model‑specific methods such as TreeSHAP (for tree ensembles) and DeepSHAP (for neural networks) exploit structural properties for speed. TreeSHAP, introduced by Lundberg et al. (2018), is among the most widely used.

TreeSHAP Example

TreeSHAP efficiently computes Shapley Values for tree models like Random Forest, XGBoost, and LightGBM. A regression tree example demonstrates the step‑by‑step calculation, with intermediate tables and visualizations.
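The core idea behind TreeSHAP's efficiency can be sketched with a toy hand‑rolled tree (this is a simplified illustration, not the shap library's implementation): the expectation of a subtree's prediction is the coverage‑weighted average of its leaves, and fixing a feature's value prunes the branches inconsistent with it. TreeSHAP combines such conditional expectations over subsets in polynomial time.

```python
# Toy regression tree as nested dicts. Internal nodes split on a feature at a
# threshold; 'n' is the number of training samples that reached the node.
# All numbers here are made up for illustration.
tree = {
    'feature': 0, 'threshold': 0.5, 'n': 100,
    'left':  {'value': 10.0, 'n': 60},
    'right': {'feature': 1, 'threshold': 2.0, 'n': 40,
              'left':  {'value': 20.0, 'n': 30},
              'right': {'value': 40.0, 'n': 10}},
}

def expected_value(node, known):
    """Expected leaf value, conditioning on the features in `known` and
    averaging over the rest using training coverage weights."""
    if 'value' in node:
        return node['value']
    f = node['feature']
    if f in known:  # feature is in the subset: follow the matching branch
        branch = node['left'] if known[f] <= node['threshold'] else node['right']
        return expected_value(branch, known)
    # feature not in the subset: average both branches by coverage
    wl = node['left']['n'] / node['n']
    wr = node['right']['n'] / node['n']
    return (wl * expected_value(node['left'], known)
            + wr * expected_value(node['right'], known))

print(expected_value(tree, {}))        # base value E[f] over the training data
print(expected_value(tree, {0: 1.0}))  # E[f | x0 = 1.0]
```

The difference between the two printed expectations is exactly the marginal contribution of knowing feature 0 relative to the empty subset, the quantity the Shapley formula averages.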

Code Implementation

<code>import pandas as pd
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

shap.initjs()  # load the JavaScript needed for interactive plots

# Load the iris data into a DataFrame
data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['target'] = data['target']
X, y = df.iloc[:, :-1], df['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=42)

# Train a random forest classifier
rf = RandomForestClassifier(n_estimators=500, random_state=42)
rf.fit(X_train, y_train)

# TreeSHAP explainer; for a multiclass model, the older shap API returns
# one array of contributions per class (newer versions return a single
# array with a class dimension)
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test.iloc[31, :])

# Force plot for a single sample, contributions toward class 1
shap.force_plot(explainer.expected_value[1], shap_values[1],
                X_test.iloc[31, :], link='logit')

# Global feature importance over the whole test set
shap_values_all = explainer.shap_values(X_test)
shap.summary_plot(shap_values_all, X_test, plot_type='bar')
</code>

Advantages and Disadvantages of SHAP

SHAP has three main strengths: it is theoretically grounded in game theory, it fairly distributes each feature’s contribution, and Shapley Values can be used to compare predictions across samples. Compared with LIME, SHAP provides consistency and additive explanations.

Its drawbacks stem from the computational cost of Shapley Values. Exact calculation is exponential; even approximations can be expensive for non‑tree models. Moreover, many approximations assume feature independence, which is often violated, reducing explanation accuracy.

References

Shaw Ping, Yang Jiaying, Su Sida, "Interpretable Machine Learning: Methods and Practice"

Tags: machine learning, SHAP, explainable AI, model interpretation, Shapley Values
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
