How Shapley Values Reveal Fair Profit Splits and Explain Machine Learning Models
This article introduces the Shapley value and its fairness axioms, demonstrates it on a profit-allocation problem, shows how SHAP applies the same idea to interpreting machine-learning models, and provides complete Python implementations for both case studies.
Shapley Value
The Shapley value, introduced by Lloyd Shapley in 1953, is a game-theoretic method for fairly dividing a cooperative payoff among players. It assigns each player the average of their marginal contributions over all possible orders in which the coalition can form.
Shapley Axioms
Shapley proposed four axioms that a fair allocation must satisfy:
Symmetry – players who contribute equally to every coalition receive identical values.
Efficiency – the players’ values sum to the total payoff of the grand coalition, so the full payoff is distributed.
Additivity – the value for a combined game equals the sum of values for the component games.
Null player – a player who adds nothing to any coalition receives zero.
These four axioms uniquely characterize the Shapley value, which is why it is so widely used in game theory and machine learning.
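For intuition, the symmetry and efficiency axioms can be checked numerically on a toy game. The sketch below computes exact Shapley values by averaging marginal contributions over all join orders; the two-player payoffs are hypothetical numbers chosen purely for illustration:

```python
from fractions import Fraction
from itertools import permutations

def shapley(players, v):
    """Exact Shapley values: average each player's marginal
    contribution over every order in which players can join."""
    values = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            values[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: values[p] / len(orders) for p in players}

# Toy game (hypothetical numbers): each player alone earns 1,
# both together earn 4, the empty coalition earns 0.
def v(s):
    return {0: 0, 1: 1, 2: 4}[len(s)]

phi = shapley([1, 2], v)
print(phi[1] == phi[2])                # symmetry: identical roles, equal shares
print(sum(phi.values()) == v({1, 2}))  # efficiency: shares sum to the total
```

Using exact fractions avoids floating-point noise when checking the axioms hold exactly.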
Case Study 1: Profit Allocation
Problem
Three partners A, B, C cooperate in a business. Pairwise profits are: A+B = 7, A+C = 5, B+C = 4, all three together = 10, and each alone = 1. How should the total profit be divided?
Analysis and Modeling
Let the profit shares be x_A, x_B, x_C with x_A + x_B + x_C = 10. The Shapley value of player i is

φ_i = Σ_{S ∋ i} (|S| − 1)!(n − |S|)!/n! · (v(S) − v(S ∖ {i})),

where the sum runs over all coalitions S containing i and v is the payoff function given in the problem. For A, the coalitions {A}, {A,B}, {A,C}, {A,B,C} carry weights 1/3, 1/6, 1/6, 1/3 and marginal contributions 1, 6, 4, 6, so φ_A = 1/3·1 + 1/6·6 + 1/6·4 + 1/3·6 = 4. Computing φ_B and φ_C the same way yields the unique fair allocation.
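Equivalently, each player’s Shapley value is the average of their marginal contribution over all n! orders in which the players can join. A minimal stdlib sketch of this permutation view for the game at hand (the dictionary encodes the payoffs from the problem statement):

```python
from itertools import permutations

# Payoffs from the problem: each partner alone earns 1,
# A+B = 7, A+C = 5, B+C = 4, and all three together earn 10.
PAYOFF = {frozenset(): 0,
          frozenset('A'): 1, frozenset('B'): 1, frozenset('C'): 1,
          frozenset('AB'): 7, frozenset('AC'): 5, frozenset('BC'): 4,
          frozenset('ABC'): 10}

players = ['A', 'B', 'C']
shares = dict.fromkeys(players, 0.0)
orders = list(permutations(players))
for order in orders:
    coalition = frozenset()
    for p in order:
        # marginal contribution of p when joining in this order
        shares[p] += PAYOFF[coalition | {p}] - PAYOFF[coalition]
        coalition = coalition | {p}
shares = {p: s / len(orders) for p, s in shares.items()}
print(shares)  # {'A': 4.0, 'B': 3.5, 'C': 2.5}
```

Averaging over all 3! = 6 orders reproduces the allocation derived above.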
Solution
Applying the Shapley calculation yields the profit distribution [4.0, 3.5, 2.5] for A, B, and C respectively.
Python Implementation
<code>import math
from itertools import combinations

n_players = 3  # players are numbered 1 (A), 2 (B), 3 (C)

def payoff(coalition):
    """Characteristic function v(S) from the problem statement."""
    if len(coalition) == 0:
        return 0
    elif len(coalition) == 1:
        return 1
    elif len(coalition) == 2:
        if 1 in coalition and 2 in coalition:
            return 7
        elif 1 in coalition and 3 in coalition:
            return 5
        elif 2 in coalition and 3 in coalition:
            return 4
    elif len(coalition) == 3:
        return 10

def marginal_contribution(player, coalition, payoff_function):
    """Contribution of the player to a coalition that contains them."""
    coalition_without = coalition.copy()
    coalition_without.remove(player)
    return payoff_function(coalition) - payoff_function(coalition_without)

shapley_values = [0, 0, 0]
for i in range(1, n_players + 1):
    for coalition_size in range(1, n_players + 1):
        for coalition in combinations(range(1, n_players + 1), coalition_size):
            if i not in coalition:
                continue
            # weight (|S|-1)! (n-|S|)! / n! for a coalition S of this size
            w = (math.factorial(coalition_size - 1)
                 * math.factorial(n_players - coalition_size)
                 / math.factorial(n_players))
            shapley_values[i - 1] += w * marginal_contribution(i, list(coalition), payoff)

print(shapley_values)  # [4.0, 3.5, 2.5]
</code>
Case Study 2: Machine Learning
Problem
Using the Iris dataset, compute the contribution of each feature (sepal length, sepal width, petal length, petal width) to the model’s prediction of flower species.
Analysis and Modeling
SHAP (SHapley Additive exPlanations) applies the Shapley value concept to explain the output of machine‑learning models. It distributes the prediction among features based on their marginal contributions across all feature subsets.
The SHAP library provides TreeSHAP for tree‑based models and approximate algorithms such as KernelSHAP and DeepSHAP for other model types.
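SHAP’s core idea can be illustrated without the library: treat each feature as a player, and define the payoff of a feature subset as the model’s output when the remaining features are held at a baseline value. The sketch below does this exactly for a hypothetical two-feature linear model (the model, baseline, and instance are illustrative assumptions, not the Iris example):

```python
from itertools import permutations

# Hypothetical model for illustration (not the Iris classifier):
# a linear function of two features.
def model(x1, x2):
    return 2 * x1 + 3 * x2

baseline = {'x1': 0.0, 'x2': 0.0}  # reference ("background") input
instance = {'x1': 1.0, 'x2': 2.0}  # the prediction to explain

def payoff(present):
    """Model output when only the features in `present` take the
    instance's values; the rest are held at the baseline."""
    x = {k: (instance[k] if k in present else baseline[k]) for k in baseline}
    return model(x['x1'], x['x2'])

names = list(instance)
phi = dict.fromkeys(names, 0.0)
orders = list(permutations(names))
for order in orders:
    present = set()
    for name in order:
        phi[name] += payoff(present | {name}) - payoff(present)
        present.add(name)
phi = {k: s / len(orders) for k, s in phi.items()}
print(phi)  # attributions sum to model(instance) - model(baseline)
```

For a linear model each attribution reduces to the coefficient times the feature’s deviation from the baseline, which is what the sketch recovers; KernelSHAP approximates this same average when enumerating every subset is infeasible.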
Solution
We train a RandomForest classifier on the Iris data and use shap.TreeExplainer to obtain feature contributions.
Python Implementation
<code>import shap
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

shap.initjs()
data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['target'] = data['target']
X, y = df.iloc[:, :-1], df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)
explainer = shap.TreeExplainer(rf)
shap_values_all = explainer.shap_values(X_test)
shap.summary_plot(shap_values_all, X_test, plot_type='bar')
</code>
The resulting bar plot shows that petal width has the largest impact on the model’s predictions, followed by petal length and sepal length.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".