Master Logistic Regression: Binary, Multiclass, and Ordered Extensions with Python
This article explains logistic regression and its extensions (binary, multiclass softmax, and ordered logistic regression), covering mathematical foundations, optimization objectives, real-world applications, and Python implementations using scikit-learn with worked code examples.
Classification is one of the most common tasks in machine learning and statistics. Whether predicting disease presence, loan default, or ad clicks, classification methods are crucial, and logistic regression remains a classic, widely used technique despite its name containing “regression.” This article explores logistic regression and its two extensions: multiclass logistic regression and ordered logistic regression, which enable handling more complex classification problems.
1 Binary Logistic Regression
Logistic regression is a statistical method widely used for binary classification tasks.
1.1 Mathematical Model
Linear function
First, we build a linear function describing the relationship between the features and the target variable, just as in linear regression:

z = w_0 + w_1 x_1 + w_2 x_2 + … + w_n x_n

where the coefficients w_0, w_1, …, w_n are the model parameters and x_1, …, x_n are the features.
Logistic (sigmoid) function
We then transform the linear output z into a probability between 0 and 1 using the sigmoid function, turning a continuous score into a classification output:

σ(z) = 1 / (1 + e^(−z))

where e is the base of the natural logarithm. As z approaches positive infinity, σ(z) approaches 1; as z approaches negative infinity, σ(z) approaches 0.
Decision boundary
To classify, we set a decision threshold (e.g., 0.5). If the predicted probability exceeds this value, the observation is assigned to class 1; otherwise, to class 0.
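As a minimal sketch, the full pipeline from linear score to sigmoid probability to thresholded label looks like the following (the coefficient values here are hypothetical, not fitted from data):

```python
import numpy as np

def sigmoid(z):
    # Map any real-valued score to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters for a model with two features
w = np.array([0.8, -1.2])
b = 0.5

x = np.array([1.0, 0.3])   # one observation
z = w @ x + b              # linear score: 0.8 - 0.36 + 0.5 = 0.94
p = sigmoid(z)             # predicted probability of class 1 (~0.72)
label = int(p > 0.5)       # decision threshold at 0.5 -> class 1
```

Lowering or raising the 0.5 threshold trades recall against precision, which matters when the two kinds of misclassification carry different costs.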
1.2 Optimization Objective
The goal of logistic regression is to maximize the likelihood function (or equivalently, minimize the log-loss). Writing p_i = σ(z_i) for the predicted probability that sample i belongs to class 1, the likelihood of the dataset is:

L(w) = ∏_i p_i^(y_i) (1 − p_i)^(1 − y_i)

For a single sample, the likelihood contribution is the probability the model assigns to the observed label; the overall likelihood is the product over all samples. In practice we work with the log-likelihood for mathematical convenience:

log L(w) = Σ_i [ y_i log p_i + (1 − y_i) log(1 − p_i) ]

Maximizing the log-likelihood (or minimizing the negative log-likelihood) is the optimization objective.
Parameters are typically estimated using gradient descent or other optimization algorithms.
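To make the estimation step concrete, here is a plain gradient-descent sketch on a small synthetic dataset (the data, learning rate, and iteration count are assumptions for illustration; scikit-learn's solvers do this far more robustly):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic, linearly separable data (assumed for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(1000):
    p = sigmoid(X @ w + b)
    # Gradient of the average negative log-likelihood
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
```

Because the data here are linearly separable, plain gradient descent recovers a near-perfect decision boundary; on real data, regularization and a convergence criterion would also be needed.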
1.3 Applications
Logistic regression is applied in many domains, including:
Medicine – predicting whether a patient has a disease (yes/no).
Finance – predicting whether a loan applicant will default.
Marketing – predicting whether a customer will purchase a product.
Social media – predicting whether a user will click an ad or link.
2 Multiclass Logistic Regression
Multiclass logistic regression (also called Softmax regression) extends logistic regression to handle more than two classes. While binary logistic regression uses the sigmoid function, multiclass problems use the softmax function to produce a probability distribution over all classes.
2.1 Mathematical Model
Assume there are K classes. For a sample with feature vector x, the model learns a weight matrix W and bias vector b, where each class k has its own weight vector w_k and bias b_k. The unnormalized score (logit) for class k is:

z_k = w_k · x + b_k

Applying the softmax function converts these scores into class probabilities:

P(y = k | x) = e^(z_k) / Σ_{j=1}^{K} e^(z_j)

where y denotes the class label; the probabilities over all K classes sum to 1.
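A numerically stable softmax can be sketched in a few lines (the logit values below are hypothetical):

```python
import numpy as np

def softmax(z):
    # Subtracting the max before exponentiating avoids overflow
    # without changing the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical logits z_k for K = 3 classes
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)  # probabilities sum to 1; class 0 is most likely
```

With K = 2, softmax reduces to the sigmoid applied to the difference of the two logits, which is why binary logistic regression is a special case.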
2.2 Optimization Objective
As with binary logistic regression, the objective is to minimize the cross-entropy loss (the negative log-likelihood). For a single sample with true class c, the loss is:

L = −log P(y = c | x)

For the whole dataset of N samples, we minimize the average loss:

J = −(1/N) Σ_{i=1}^{N} log P(y_i = c_i | x_i)
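For example, the average cross-entropy over three samples (with hypothetical predicted probabilities) can be computed directly:

```python
import numpy as np

# Hypothetical predicted probabilities for 3 samples over K = 3 classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])
y_true = np.array([0, 1, 2])  # true class index of each sample

# Average negative log-probability assigned to the true class
loss = -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))
# loss = -(ln 0.7 + ln 0.8 + ln 0.5) / 3, about 0.424
```

Note that only the probability assigned to the true class enters the loss; the model is penalized heavily when that probability is close to zero.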
2.3 Applications
Multiclass logistic regression can be used for various multi‑category tasks, such as:
Handwritten digit recognition – classifying digits 0‑9.
News article categorization – labeling articles as politics, sports, entertainment, etc.
Image classification – assigning images to predefined categories.
3 Ordered Logistic Regression
Ordered logistic regression (also known as the ordinal logit or proportional-odds model) extends logistic regression to handle ordered categorical variables, where the categories have a natural ranking. For example, rating a product as poor, fair, good, or excellent is an ordered classification task. Unlike standard multiclass logistic regression, ordered logistic regression incorporates the order relationship between categories into the prediction.
3.1 Mathematical Model
Assume there are J ordered categories. The model defines a series of thresholds θ_1 < θ_2 < … < θ_{J−1} and a linear function η = Xβ. The cumulative probability of falling at or below category j is P(y ≤ j) = σ(θ_j − η), so the probability of belonging to category j is the difference of two sigmoid functions evaluated at adjacent thresholds:

P(y = j) = σ(θ_j − η) − σ(θ_{j−1} − η)

where σ is the sigmoid function, with the conventions θ_0 = −∞ and θ_J = +∞.
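With hypothetical thresholds and a single linear score η, the category probabilities can be computed as differences of cumulative sigmoids:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical thresholds theta_1 < theta_2 for J = 3 ordered categories
thetas = np.array([-1.0, 1.0])
eta = 0.4  # linear score x . beta for one observation

# Cumulative probabilities P(y <= j) = sigmoid(theta_j - eta)
cum = sigmoid(thetas - eta)

# P(y = j) as differences, using theta_0 = -inf and theta_J = +inf
probs = np.diff(np.concatenate(([0.0], cum, [1.0])))
# probs has one entry per category and sums to 1
```

Because all categories share the single coefficient vector β and differ only in their thresholds, increasing η shifts probability mass monotonically toward higher categories, which is exactly the ordering constraint the model encodes.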
3.2 Optimization Objective
As with standard logistic regression, the goal is to minimize the negative log‑likelihood loss.
3.3 Applications
Ordered logistic regression is useful in scenarios such as:
Product rating – predicting ordered star ratings.
Health assessment – classifying health status as poor, moderate, or good.
Economic assessment – predicting income level as low, medium, or high.
4 Python Implementation of Multiclass and Ordered Regression
We demonstrate the concepts using scikit-learn datasets. For multiclass logistic regression we use the digits dataset (handwritten digit images 0-9). For ordered logistic regression we use the wine dataset; its three target classes are actually cultivars rather than quality grades, but we treat them as ordered categories here for demonstration purposes.
4.1 Handwritten Digit Recognition (Multiclass)
Steps: load data, split into training and test sets, train a multinomial logistic regression model, evaluate performance.
<code>from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
# 1. Load data
digits = datasets.load_digits()
# 2. Split data
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.2, random_state=823)
# 3. Train multinomial logistic regression
logistic_model = LogisticRegression(max_iter=10000)  # multinomial (softmax) handling is the default with the lbfgs solver
logistic_model.fit(X_train, y_train)
# 4. Evaluate
y_pred = logistic_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)
accuracy, classification_rep
</code>The model achieves 96% accuracy on the test set, with high precision, recall, and F1 scores for each digit.
4.2 Wine Quality Rating (Ordered)
scikit‑learn does not provide a direct ordered logistic regression implementation, but we can simulate it by training a binary logistic regression model for each threshold and combining their predictions.
<code># 1. Load data
wine = datasets.load_wine()
# 2. Data already has three ordered classes
# 3. Split data
X_train_wine, X_test_wine, y_train_wine, y_test_wine = train_test_split(wine.data, wine.target, test_size=0.2, random_state=823)
# 4. Train binary models for each threshold
models = []
thresholds = [0, 1] # thresholds for ordered categories
for thresh in thresholds:
    y_train_binary = (y_train_wine > thresh).astype(int)
    model = LogisticRegression(max_iter=10000)
    model.fit(X_train_wine, y_train_binary)
    models.append(model)
# 5. Predict ordered categories
def predict_ordered(models, X):
    probs = [model.predict_proba(X)[:, 1] for model in models]
    predictions = []
    for prob_tuple in zip(*probs):
        if prob_tuple[0] <= 0.5:
            predictions.append(0)
        elif prob_tuple[1] <= 0.5:
            predictions.append(1)
        else:
            predictions.append(2)
    return predictions
y_pred_wine = predict_ordered(models, X_test_wine)
accuracy_wine = accuracy_score(y_test_wine, y_pred_wine)
classification_rep_wine = classification_report(y_test_wine, y_pred_wine)
accuracy_wine, classification_rep_wine
</code>The ordered logistic regression approach achieves over 88% accuracy on the test set, demonstrating good performance with room for further improvement.
Conclusion
Logistic regression, despite its simplicity, remains popular due to its strong classification ability and intuitive interpretability. When problems extend beyond binary classification to multiclass or ordered categories, multiclass and ordered logistic regression provide effective tools. Selecting the appropriate model and optimization method still depends on the specific data and business requirements. This article aims to deepen readers' understanding of these techniques and help them apply the methods in their own projects.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".