
10 Common Loss Functions and Their Python Implementations

This article explains ten widely used loss functions for regression and classification tasks, describes their mathematical definitions, compares their purposes, and provides complete Python code examples for each, helping readers understand how to select and implement appropriate loss metrics in machine‑learning models.


What is a loss function?

A loss function measures the discrepancy between a model's predictions and the true values; lower values indicate better predictions. The average of individual losses across all samples is called the cost function.
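As a minimal illustration (the arrays and values are made up), each sample's squared error is its loss, and the cost is their average:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.0, 4.0])

# Loss: one value per sample
per_sample_loss = (y_pred - y_true) ** 2

# Cost: the average loss over all samples
cost = per_sample_loss.mean()
print(per_sample_loss)  # [0.25 0.   4.  ]
print(cost)             # 1.4166...
```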

Loss functions vs. evaluation metrics

While some loss functions can serve as evaluation metrics, loss functions are primarily used during model training to guide optimization, whereas metrics assess final model performance.

Why use loss functions?

Loss functions quantify prediction errors, enabling gradient‑based optimization to adjust model parameters toward better performance.
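For instance, a bare-bones gradient-descent loop (a sketch with made-up toy data, not a production training loop) can fit a one-parameter linear model by repeatedly stepping against the gradient of the MSE loss:

```python
import numpy as np

# Toy data generated from y = 2x, so the slope to recover is 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0    # initial parameter guess
lr = 0.01  # learning rate
for _ in range(500):
    y_pred = w * x
    # Gradient of MSE with respect to w: mean of 2 * (y_pred - y) * x
    grad = np.mean(2.0 * (y_pred - y) * x)
    w -= lr * grad  # step against the gradient

print(round(w, 3))  # converges toward 2.0
```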

Regression loss functions

All implementations below assume NumPy has been imported as np (import numpy as np); the RMSE example also uses the standard math module.

1. Mean Squared Error (MSE)

Computes the average of squared differences between predicted and true values.

def MSE(y, y_predicted):
    sq_error = (y_predicted - y) ** 2
    sum_sq_error = np.sum(sq_error)
    mse = sum_sq_error / y.size
    return mse

2. Mean Absolute Error (MAE)

Calculates the average absolute difference, which is more robust to outliers.

def MAE(y, y_predicted):
    error = y_predicted - y
    absolute_error = np.absolute(error)
    total_absolute_error = np.sum(absolute_error)
    mae = total_absolute_error / y.size
    return mae
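To see the robustness claim concretely (toy numbers, minimal sketch): a single outlier inflates MSE far more than MAE, because the error is squared before averaging.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 14.0])  # one outlier, off by 10

mae = np.mean(np.abs(y_pred - y))  # 10 / 4  = 2.5
mse = np.mean((y_pred - y) ** 2)   # 100 / 4 = 25.0
print(mae, mse)
```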

3. Root Mean Squared Error (RMSE)

The square root of MSE, useful when the error scale should match the original units.

def RMSE(y, y_predicted):
    sq_error = (y_predicted - y) ** 2
    total_sq_error = np.sum(sq_error)
    mse = total_sq_error / y.size
    rmse = math.sqrt(mse)
    return rmse

4. Mean Bias Error (MBE)

Similar to MAE but retains the sign of the error, indicating systematic over‑ or under‑prediction.

def MBE(y, y_predicted):
    error = y_predicted - y
    total_error = np.sum(error)
    mbe = total_error / y.size
    return mbe
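Because errors keep their sign, over- and under-predictions cancel out, so MBE reveals systematic bias rather than overall error size (made-up example):

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])

# Symmetric errors cancel: MBE is 0 even though every prediction is wrong
balanced = np.array([2.0, 1.0, 4.0, 3.0])
print(np.mean(balanced - y))  # 0.0

# Consistent over-prediction shows up as a positive bias
biased = y + 0.5
print(np.mean(biased - y))    # 0.5
```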

5. Huber Loss

Combines MAE and MSE, using a quadratic loss for small errors and linear loss for large errors.

def huber_loss(y, y_predicted, delta=1.35):
    total_error = 0
    for i in range(y.size):
        error = np.absolute(y_predicted[i] - y[i])
        if error < delta:
            # Quadratic region for small errors (behaves like MSE)
            huber_error = (error * error) / 2
        else:
            # Linear region for large errors (behaves like MAE)
            huber_error = delta * (error - 0.5 * delta)
        total_error += huber_error
    total_huber_error = total_error / y.size
    return total_huber_error
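A quick check of a single residual (a sketch; the standard per-sample Huber formula is e**2 / 2 below the threshold delta and delta * (e - delta / 2) above it):

```python
def huber_one(error, delta=1.35):
    # Quadratic inside the threshold, linear outside
    if error < delta:
        return error ** 2 / 2
    return delta * (error - 0.5 * delta)

print(huber_one(0.5))   # small residual: quadratic -> 0.125
print(huber_one(10.0))  # large residual: grows only linearly -> 12.58875
```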

Binary classification loss functions

6. Likelihood Loss (LHL)

Averages the probability the model assigns to the true class (p for positive samples, 1 − p for negative ones); the average is negated so that better predictions give a lower loss.

def LHL(y, y_predicted):
    # Probability assigned to the true class: p when y = 1, 1 - p when y = 0
    likelihood = (y * y_predicted) + ((1 - y) * (1 - y_predicted))
    total_likelihood = np.sum(likelihood)
    lhl = - total_likelihood / y.size
    return lhl

7. Binary Cross‑Entropy (BCE)

Penalizes confident but wrong predictions by applying the logarithm to predicted probabilities.

def BCE(y, y_predicted):
    ce_loss = y * np.log(y_predicted) + (1 - y) * np.log(1 - y_predicted)
    total_ce = np.sum(ce_loss)
    bce = - total_ce / y.size
    return bce
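A toy check shows how the log term punishes confident mistakes (assuming predicted probabilities are kept strictly between 0 and 1, since log(0) is undefined):

```python
import numpy as np

def bce_one(y, p):
    # Per-sample binary cross-entropy
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# True label is 1: a confident correct prediction costs little,
# a confident wrong prediction costs a lot
print(bce_one(1, 0.9))  # ~0.105
print(bce_one(1, 0.1))  # ~2.303
```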

8. Hinge Loss and Squared Hinge Loss

Used for support‑vector machines; penalizes predictions that are on the wrong side of the margin. Both variants assume labels encoded as -1 and +1 and raw model scores rather than probabilities.

# Hinge Loss (labels y in {-1, +1})
def Hinge(y, y_predicted):
    hinge_loss = np.sum(np.maximum(0, 1 - (y_predicted * y)))
    return hinge_loss / y.size

# Squared Hinge Loss (penalizes margin violations more sharply)
def SqHinge(y, y_predicted):
    sq_hinge_loss = np.maximum(0, 1 - (y_predicted * y)) ** 2
    total_sq_hinge_loss = np.sum(sq_hinge_loss)
    return total_sq_hinge_loss / y.size
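With labels encoded as -1/+1 (a made-up example), correct predictions beyond the margin cost nothing, while wrong-side predictions are penalized linearly in the score:

```python
import numpy as np

y = np.array([1, -1, 1])              # labels in {-1, +1}
scores = np.array([2.0, -0.5, -1.0])  # raw model scores, not probabilities

per_sample = np.maximum(0, 1 - y * scores)
print(per_sample)  # [0.  0.5 2. ]
```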

Multiclass classification loss functions

9. Categorical Cross‑Entropy (CCE)

Generalizes binary cross‑entropy to multiple classes.

def CCE(y, y_predicted):
    # y is one-hot encoded; only the true-class log-probabilities contribute
    cce_class = y * np.log(y_predicted)
    sum_totalpair_cce = np.sum(cce_class)
    # Average over samples (rows), not over every element of the one-hot matrix
    cce = - sum_totalpair_cce / y.shape[0]
    return cce
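A small check with one-hot labels (made-up numbers): only the log-probability of the true class contributes, averaged over the two samples:

```python
import numpy as np

# Two samples, three classes; rows of y are one-hot true labels
y = np.array([[1, 0, 0],
              [0, 1, 0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

cce = -np.sum(y * np.log(y_pred)) / y.shape[0]
print(cce)  # (-ln 0.7 - ln 0.8) / 2, roughly 0.290
```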

10. Kullback‑Leibler Divergence (KLD)

Measures how one probability distribution diverges from a reference distribution, useful for imbalanced classes.

def KL(y, y_predicted):
    # Assumes both distributions are strictly positive and sum to 1
    kl = y * np.log(y / y_predicted)
    total_kl = np.sum(kl)
    return total_kl
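For two made-up discrete distributions (both strictly positive and summing to 1), the divergence is small when they are close and exactly zero when they are identical:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])  # reference distribution
q = np.array([0.4, 0.4, 0.2])  # approximating distribution

kl = np.sum(p * np.log(p / q))
print(kl)  # ~0.0253; KL of p against itself would be exactly 0
```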

These ten loss functions cover the most common scenarios in regression and classification, providing both theoretical insight and ready‑to‑use Python code.

Tags: Machine Learning, AI, regression, classification, loss functions
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
