How Shapley Values Reveal Fair Profit Splits and Explain Machine Learning Models
This article introduces the Shapley value and its fairness axioms, demonstrates it on a profit-allocation problem, shows how SHAP applies the same idea to interpreting machine-learning models, and provides complete Python implementations for both case studies.
Shapley Value
The Shapley value, introduced by Lloyd Shapley in 1953, is a game-theoretic method for fairly dividing a cooperative payoff among players. It assigns each player the average of their marginal contributions over all possible orders in which the coalition can form.
Shapley Axioms
Shapley proposed four axioms that a fair allocation must satisfy:
Symmetry – players who contribute equally to every coalition receive identical values.
Efficiency – the players’ values sum to the total payoff of the grand coalition, so the full payoff is distributed.
Additivity – the value for a combined game equals the sum of values for the component games.
Null player – a player who adds nothing to any coalition receives zero.
These four axioms uniquely characterize the Shapley value, which is why it is so widely used in game theory and machine learning.
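For intuition, the symmetry and efficiency axioms can be checked numerically on a toy game. The sketch below computes exact Shapley values by averaging marginal contributions over all join orders; the two-player payoffs are hypothetical numbers chosen purely for illustration:

```python
from fractions import Fraction
from itertools import permutations

def shapley(players, v):
    """Exact Shapley values: average each player's marginal
    contribution over every order in which players can join."""
    values = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            values[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: values[p] / len(orders) for p in players}

# Toy game (hypothetical numbers): each player alone earns 1,
# both together earn 4, the empty coalition earns 0.
def v(s):
    return {0: 0, 1: 1, 2: 4}[len(s)]

phi = shapley([1, 2], v)
print(phi[1] == phi[2])                # symmetry: identical roles, equal shares
print(sum(phi.values()) == v({1, 2}))  # efficiency: shares sum to the total
```

Using exact fractions avoids floating-point noise when checking the axioms hold exactly.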
Case Study 1: Profit Allocation
Problem
Three partners A, B, C cooperate in a business. Pairwise profits are: A+B = 7, A+C = 5, B+C = 4, all three together = 10, and each alone = 1. How should the total profit be divided?
Analysis and Modeling
Let the profit shares be x_A, x_B, x_C with x_A + x_B + x_C = 10. The Shapley value of player i is

φ_i = Σ_{S ∋ i} (|S| − 1)!(n − |S|)!/n! · (v(S) − v(S ∖ {i})),

where the sum runs over all coalitions S containing i and v is the payoff function given in the problem. For A, the coalitions {A}, {A,B}, {A,C}, {A,B,C} carry weights 1/3, 1/6, 1/6, 1/3 and marginal contributions 1, 6, 4, 6, so φ_A = 1/3·1 + 1/6·6 + 1/6·4 + 1/3·6 = 4. Computing φ_B and φ_C the same way yields the unique fair allocation.
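Equivalently, each player’s Shapley value is the average of their marginal contribution over all n! orders in which the players can join. A minimal stdlib sketch of this permutation view for the game at hand (the dictionary encodes the payoffs from the problem statement):

```python
from itertools import permutations

# Payoffs from the problem: each partner alone earns 1,
# A+B = 7, A+C = 5, B+C = 4, and all three together earn 10.
PAYOFF = {frozenset(): 0,
          frozenset('A'): 1, frozenset('B'): 1, frozenset('C'): 1,
          frozenset('AB'): 7, frozenset('AC'): 5, frozenset('BC'): 4,
          frozenset('ABC'): 10}

players = ['A', 'B', 'C']
shares = dict.fromkeys(players, 0.0)
orders = list(permutations(players))
for order in orders:
    coalition = frozenset()
    for p in order:
        # marginal contribution of p when joining in this order
        shares[p] += PAYOFF[coalition | {p}] - PAYOFF[coalition]
        coalition = coalition | {p}
shares = {p: s / len(orders) for p, s in shares.items()}
print(shares)  # {'A': 4.0, 'B': 3.5, 'C': 2.5}
```

Averaging over all 3! = 6 orders reproduces the allocation derived above.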
Solution
Applying the Shapley calculation yields the profit distribution [4.0, 3.5, 2.5] for A, B, and C respectively.
Python Implementation
<code>import math
from itertools import combinations

n_players = 3  # players are numbered 1 (A), 2 (B), 3 (C)

def payoff(coalition):
    """Characteristic function v(S) from the problem statement."""
    if len(coalition) == 0:
        return 0
    elif len(coalition) == 1:
        return 1
    elif len(coalition) == 2:
        if 1 in coalition and 2 in coalition:
            return 7
        elif 1 in coalition and 3 in coalition:
            return 5
        elif 2 in coalition and 3 in coalition:
            return 4
    elif len(coalition) == 3:
        return 10

def marginal_contribution(player, coalition, payoff_function):
    """Contribution of the player to a coalition that contains them."""
    coalition_without = coalition.copy()
    coalition_without.remove(player)
    return payoff_function(coalition) - payoff_function(coalition_without)

shapley_values = [0, 0, 0]
for i in range(1, n_players + 1):
    for coalition_size in range(1, n_players + 1):
        for coalition in combinations(range(1, n_players + 1), coalition_size):
            if i not in coalition:
                continue
            # weight (|S|-1)! (n-|S|)! / n! for a coalition S of this size
            w = (math.factorial(coalition_size - 1)
                 * math.factorial(n_players - coalition_size)
                 / math.factorial(n_players))
            shapley_values[i - 1] += w * marginal_contribution(i, list(coalition), payoff)

print(shapley_values)  # [4.0, 3.5, 2.5]
</code>
Case Study 2: Machine Learning
Problem
Using the Iris dataset, compute the contribution of each feature (sepal length, sepal width, petal length, petal width) to the model’s prediction of flower species.
Analysis and Modeling
SHAP (SHapley Additive exPlanations) applies the Shapley value concept to explain the output of machine‑learning models. It distributes the prediction among features based on their marginal contributions across all feature subsets.
The SHAP library provides TreeSHAP for tree‑based models and approximate algorithms such as KernelSHAP and DeepSHAP for other model types.
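SHAP’s core idea can be illustrated without the library: treat each feature as a player, and define the payoff of a feature subset as the model’s output when the remaining features are held at a baseline value. The sketch below does this exactly for a hypothetical two-feature linear model (the model, baseline, and instance are illustrative assumptions, not the Iris example):

```python
from itertools import permutations

# Hypothetical model for illustration (not the Iris classifier):
# a linear function of two features.
def model(x1, x2):
    return 2 * x1 + 3 * x2

baseline = {'x1': 0.0, 'x2': 0.0}  # reference ("background") input
instance = {'x1': 1.0, 'x2': 2.0}  # the prediction to explain

def payoff(present):
    """Model output when only the features in `present` take the
    instance's values; the rest are held at the baseline."""
    x = {k: (instance[k] if k in present else baseline[k]) for k in baseline}
    return model(x['x1'], x['x2'])

names = list(instance)
phi = dict.fromkeys(names, 0.0)
orders = list(permutations(names))
for order in orders:
    present = set()
    for name in order:
        phi[name] += payoff(present | {name}) - payoff(present)
        present.add(name)
phi = {k: s / len(orders) for k, s in phi.items()}
print(phi)  # attributions sum to model(instance) - model(baseline)
```

For a linear model each attribution reduces to the coefficient times the feature’s deviation from the baseline, which is what the sketch recovers; KernelSHAP approximates this same average when enumerating every subset is infeasible.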
Solution
We train a RandomForest classifier on the Iris data and use shap.TreeExplainer to obtain feature contributions.
Python Implementation
<code>import shap
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

shap.initjs()
data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['target'] = data['target']
X, y = df.iloc[:, :-1], df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)
explainer = shap.TreeExplainer(rf)
shap_values_all = explainer.shap_values(X_test)
shap.summary_plot(shap_values_all, X_test, plot_type='bar')
</code>
The resulting bar plot shows that petal width has the largest impact on the model’s predictions, followed by petal length and sepal length.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".