Visualizing Random Forest Decision Boundaries on the Wine Dataset with dtreeviz

This tutorial demonstrates how to load the wine dataset, train a Random Forest classifier, evaluate its accuracy and confusion matrix, and visualize decision boundaries and misclassifications using scikit‑learn and the dtreeviz library.

Model Perspective

Classification models such as logistic regression, k-nearest neighbors, and decision trees predict the class of unseen data from its features. This article uses the wine dataset to show how to evaluate predictions with accuracy and a confusion matrix, and how to visualize decision boundaries and misclassifications for a Random Forest with the dtreeviz library.

Data

We load the wine dataset with load_wine(). It contains 178 samples with 13 numeric features, and each sample belongs to one of three classes (0, 1, 2). This example uses only two of the features: proline and flavanoids.
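
Before selecting columns by position, it can help to confirm the dataset's shape and look up the feature indices by name. The following is a minimal sketch; the variable names proline_idx and flavanoids_idx are illustrative and not part of the original article.

```python
from sklearn.datasets import load_wine

wine = load_wine()

# Overall shape of the feature matrix: (samples, features)
print(wine.data.shape)

# Look up column positions by name instead of hard-coding them
names = list(wine.feature_names)
proline_idx = names.index("proline")
flavanoids_idx = names.index("flavanoids")
print(proline_idx, flavanoids_idx)

# Distinct class labels found in the target vector
classes = sorted(set(wine.target.tolist()))
print(classes)
```

The two indices found this way are the ones used to slice wine.data in the next section.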

Model

We train a Random Forest classifier with 50 trees and a minimum of 20 samples per leaf on the two selected features.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score
import dtreeviz
from dtreeviz import decision_boundaries

# Load the data and keep two features: column 12 (proline) and column 6 (flavanoids)
wine = load_wine()
X = wine.data[:, [12, 6]]
y = wine.target

# 50 trees, each leaf holding at least 20 samples; n_jobs=-1 uses all CPU cores
rf = RandomForestClassifier(n_estimators=50, min_samples_leaf=20, n_jobs=-1)
rf.fit(X, y)

Evaluation Results

Accuracy

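accuracy_score gives an accuracy of about 0.90. This is measured on the same data the model was trained on, so it is a training accuracy rather than an estimate of performance on unseen samples.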

y_pred = rf.predict(X)
accuracy_score(y, y_pred)
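
The accuracy above is computed on the samples used for fitting. As a hedged sketch of how a held-out estimate could be obtained, the block below splits the data with train_test_split. The 30% test size, the stratified split, and random_state=0 are assumptions added here for reproducibility; none of them come from the original article.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

wine = load_wine()
X = wine.data[:, [12, 6]]  # proline, flavanoids
y = wine.target

# Hold out 30% of the samples; stratify keeps class proportions similar
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# random_state is an added assumption so repeated runs build the same forest
rf_split = RandomForestClassifier(
    n_estimators=50, min_samples_leaf=20, random_state=0
)
rf_split.fit(X_train, y_train)

train_acc = accuracy_score(y_train, rf_split.predict(X_train))
test_acc = accuracy_score(y_test, rf_split.predict(X_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Comparing the two numbers gives a rough sense of how much the training accuracy overstates performance on new samples.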

Confusion Matrix

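The confusion matrix counts predictions by true class (rows) and predicted class (columns): the diagonal holds correct predictions and the off-diagonal cells hold errors. We plot it as an annotated heatmap with seaborn.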

import seaborn as sn
sn.heatmap(confusion_matrix(y, y_pred), annot=True)
[Figure: Confusion matrix]
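
To make the layout of a confusion matrix concrete, here is a small sketch on toy labels. The labels are made up for illustration and are not taken from the wine data. It shows how overall accuracy and per-class recall can be read off the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels, invented for illustration only
y_true = np.array([0, 0, 1, 1, 2, 2])
y_hat = np.array([0, 1, 1, 1, 2, 0])

# Rows index the true class, columns index the predicted class
cm = confusion_matrix(y_true, y_hat)
print(cm)

# Accuracy is the sum of the diagonal divided by the total count
accuracy = np.trace(cm) / cm.sum()

# Recall per class is each diagonal entry divided by its row sum
recall_per_class = np.diag(cm) / cm.sum(axis=1)
print(accuracy, recall_per_class)
```

In this toy case four of six predictions are correct, and class 1 is the only class recovered without error.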

Original vs Predicted Data

We draw two scatter plots of proline against flavanoids: the left one is colored by true class and the right one by predicted class.

fig,axes = plt.subplots(1,2,figsize=(8,3.8),dpi=300)
features = ['proline','flavanoids']
df1 = pd.DataFrame(X, columns=features)
df1['target'] = wine.target
df1['prediction'] = rf.predict(X)
sn.scatterplot(x='proline', y='flavanoids', hue='target', data=df1, ax=axes[0])
sn.scatterplot(x='proline', y='flavanoids', hue='prediction', data=df1, ax=axes[1])
[Figure: Original vs predicted]
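
Differences between the two scatter plots can be hard to spot by eye. One way to list them explicitly is to filter the rows where the target and prediction columns disagree. This is a sketch under the same model settings as above, with random_state=0 added as an assumption so the result is repeatable; the names df, rf_demo, and misclassified are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

wine = load_wine()
X = wine.data[:, [12, 6]]  # proline, flavanoids
df = pd.DataFrame(X, columns=["proline", "flavanoids"])
df["target"] = wine.target

# random_state is an added assumption for repeatable output
rf_demo = RandomForestClassifier(
    n_estimators=50, min_samples_leaf=20, random_state=0
)
rf_demo.fit(X, wine.target)
df["prediction"] = rf_demo.predict(X)

# Keep only the rows where the prediction differs from the true class
misclassified = df[df["target"] != df["prediction"]]
print(len(misclassified), "of", len(df), "samples misclassified")
print(misclassified.head())
```

The fraction of rows left out of misclassified equals the training accuracy of this particular forest.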

Decision Boundaries

Using dtreeviz.decision_boundaries we plot the classification regions and highlight misclassified points.

fig, axes = plt.subplots(1, 2, figsize=(8, 3.8), dpi=300)
# Left: default display options
decision_boundaries(rf, X, y, ax=axes[0], feature_names=['proline', 'flavanoids'])
# Right: restrict the display to instances, boundaries, and misclassified points
decision_boundaries(rf, X, y, ax=axes[1],
    show=['instances', 'boundaries', 'misclassified'],
    feature_names=['proline', 'flavanoids'])
plt.show()
[Figure: Decision boundaries]
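
For intuition about what a decision-region plot represents, the sketch below draws a similar picture with plain NumPy and matplotlib: predict the class at every point of a regular grid, fill the regions, and circle the misclassified samples. This is not how dtreeviz is implemented; it is an independent illustration, and the grid resolution of 200 and random_state=0 are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

wine = load_wine()
X = wine.data[:, [12, 6]]  # proline, flavanoids
y = wine.target
rf_grid = RandomForestClassifier(
    n_estimators=50, min_samples_leaf=20, random_state=0
).fit(X, y)

# Regular grid covering the observed range of both features
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
    np.linspace(X[:, 1].min(), X[:, 1].max(), 200),
)
grid_points = np.column_stack([xx.ravel(), yy.ravel()])

# Predicted class at every grid point, reshaped back to the grid
zz = rf_grid.predict(grid_points).reshape(xx.shape)

# Boolean mask of training samples the forest gets wrong
wrong = rf_grid.predict(X) != y

fig, ax = plt.subplots(figsize=(5, 4))
ax.contourf(xx, yy, zz, levels=[-0.5, 0.5, 1.5, 2.5], alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=y, s=12)
ax.scatter(X[wrong, 0], X[wrong, 1],
           facecolors="none", edgecolors="red", s=60)
ax.set_xlabel("proline")
ax.set_ylabel("flavanoids")
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure.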

One‑Dimensional Boundary

We can also visualize the boundary using a single feature (proline). Here we fit a new, smaller forest on that feature alone.

x = df1[['proline']].values
y = df1['target'].astype('int').values
rf = RandomForestClassifier(n_estimators=10, min_samples_leaf=10, n_jobs=-1)
rf.fit(x, y)
decision_boundaries(rf, x, y,
    feature_names=['proline'],
    target_name='wine_type',
    colors={'scatter_marker_alpha': .2},
    figsize=(5,1.5))
[Figure: 1D decision boundary]
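
With a single feature, the decision boundary reduces to a set of proline values at which the predicted class changes. A rough way to locate them numerically is to predict along a fine grid and look for changes between neighboring points. This is a sketch; the 1000-point grid and random_state=0 are assumptions, and the names x1, rf_1d, and change are illustrative.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

wine = load_wine()
x1 = wine.data[:, [12]]  # proline only, kept 2-D for scikit-learn
y1 = wine.target
rf_1d = RandomForestClassifier(
    n_estimators=10, min_samples_leaf=10, random_state=0
).fit(x1, y1)

# Evenly spaced proline values across the observed range
grid = np.linspace(x1.min(), x1.max(), 1000).reshape(-1, 1)
pred = rf_1d.predict(grid)

# Indices i where the predicted class differs between grid[i] and grid[i + 1]
change = np.flatnonzero(np.diff(pred) != 0)
for i in change:
    print(f"class {pred[i]} -> {pred[i + 1]} near proline = {grid[i + 1, 0]:.0f}")
```

The printed values are approximate, with a resolution set by the grid spacing.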

Together, these plots show the classification results, decision boundaries, and misclassifications of a Random Forest model.

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
