Boost Your Models with LightGBM: Fast, Accurate Gradient Boosting in Python
This article introduces LightGBM, a high‑performance gradient boosting framework, explains its advantages over XGBoost, and provides step‑by‑step Python code for building classification and regression models on the Iris dataset, including model training, evaluation, and visualizing feature importance and tree structures.
Introduction
At the end of 2016, Microsoft's DMTK team open‑sourced LightGBM on GitHub, where it quickly gained over 1,000 stars and 200 forks, a clear sign of its popularity.
Gradient Boosting Decision Tree (GBDT) is a long‑standing workhorse of machine learning: it iteratively combines weak learners (decision trees) into a strong ensemble, training efficiently while resisting over‑fitting. GBDT is widely used in industry for click‑through‑rate prediction and search ranking, and it dominates many Kaggle competition solutions.
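The boosting loop behind GBDT can be sketched in a few lines with scikit‑learn trees. This is a toy illustration of the idea only, not LightGBM's actual implementation; the synthetic data and hyper‑parameters here are made up for the sketch:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy gradient boosting for squared error: each new shallow tree fits the
# residuals (the negative gradient of the L2 loss) of the ensemble so far.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate, trees = 0.1, []
pred = np.zeros_like(y)
for _ in range(100):
    tree = DecisionTreeRegressor(max_depth=2).fit(X, y - pred)  # fit residuals
    pred += learning_rate * tree.predict(X)                     # shrink and add
    trees.append(tree)

print("training MSE:", np.mean((y - pred) ** 2))
```

The learning rate shrinks each tree's contribution, which is what gives boosting its resistance to over‑fitting; LightGBM applies the same principle with far more efficient tree construction.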
LightGBM (Light Gradient Boosting Machine) is a lightweight framework implementing the GBDT algorithm, offering high‑efficiency parallel training and several advantages:
Faster training speed
Lower memory consumption
Higher accuracy
Support for parallel learning
Ability to handle large‑scale data
Python Implementation
Classification Model
Training the model
<code>import lightgbm as gbm
from sklearn.datasets import load_iris
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
# read in the iris data
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# train the model
# LightGBM's multiclass objective is 'multiclass' ('multi:softmax' is XGBoost syntax);
# the deprecated silent flag is omitted
model = gbm.LGBMClassifier(max_depth=5, learning_rate=0.1, n_estimators=160, objective='multiclass')
model.fit(X_train, y_train)
# predict on the test set
ans = model.predict(X_test)
# score returns mean accuracy on the held-out set
model.score(X_test, y_test)</code>
Plotting feature importance
<code>plt.figure()
gbm.plot_importance(model)
plt.show()</code>
Plotting the tree
Note that plot_tree depends on the graphviz package; install it first if tree plots fail to render.
<code>plt.figure()
gbm.plot_tree(model)
plt.show()</code>
Regression Model
Training the model
<code>import lightgbm as gbm
from sklearn.datasets import load_iris
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
# read in the iris data
iris = load_iris()
X = iris.data[:, :3]  # first three features as inputs
y = iris.data[:, 3]   # petal width as the regression target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# train the model
model = gbm.LGBMRegressor(max_depth=5, learning_rate=0.1, n_estimators=160)  # deprecated silent flag omitted
model.fit(X_train, y_train)
# predict on the test set
ans = model.predict(X_test)
# score returns R^2 for regressors
model.score(X_test, y_test)</code>
Plotting feature importance
<code>plt.figure()
gbm.plot_importance(model)
plt.show()</code>
Plotting the tree
<code>plt.figure()
gbm.plot_tree(model)
plt.show()</code>
Reference
https://blog.csdn.net/anshuai_aw1/article/details/83659932
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".