
Boost Your Models with LightGBM: Fast, Accurate Gradient Boosting in Python

This article introduces LightGBM, a high‑performance gradient boosting framework, explains its advantages over XGBoost, and provides step‑by‑step Python code for building classification and regression models on the Iris dataset, including model training, evaluation, and visualizing feature importance and tree structures.


Introduction

At the end of 2016, Microsoft’s DMTK team open‑sourced LightGBM on GitHub, where it quickly gained over 1,000 stars and 200 forks, an early sign of its popularity.

Gradient Boosting Decision Tree (GBDT) is a long‑standing machine‑learning model: it iteratively combines weak learners (decision trees) into a strong ensemble with good training performance and resistance to over‑fitting. GBDT is widely used in industry for click‑through‑rate prediction and search ranking, and it dominates many Kaggle competition solutions.

LightGBM (Light Gradient Boosting Machine) is a lightweight framework implementing the GBDT algorithm, offering high‑efficiency parallel training and several advantages:

Faster training speed

Lower memory consumption

Higher accuracy

Support for parallel learning

Ability to handle large‑scale data

Python Implementation

Classification Model

Training the model

<code>import lightgbm as gbm
from sklearn.datasets import load_iris
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split

# read in the iris data
iris = load_iris()

X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# train the model
# note: 'multiclass' is LightGBM's multi-class objective ('multi:softmax' is XGBoost's
# spelling), and the 'silent' argument has been removed from recent LightGBM releases
model = gbm.LGBMClassifier(max_depth=5, learning_rate=0.1, n_estimators=160, objective='multiclass')
model.fit(X_train, y_train)

# predict on the test set
ans = model.predict(X_test)

print(model.score(X_test, y_test))  # mean accuracy on the test set</code>

Plotting feature importance

<code>gbm.plot_importance(model)  # creates its own axes, so no plt.figure() is needed
plt.show()</code>

Plotting the tree

<code># plotting a tree requires the graphviz package
gbm.plot_tree(model, tree_index=0)
plt.show()</code>

Regression Model

Training the model

<code>import lightgbm as gbm
from sklearn.datasets import load_iris
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split

# read in the iris data
iris = load_iris()

X = iris.data[:, :3]  # first three measurements as features
y = iris.data[:, 3]   # petal width as the regression target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# train the model
# note: the 'silent' argument has been removed from recent LightGBM releases
model = gbm.LGBMRegressor(max_depth=5, learning_rate=0.1, n_estimators=160)
model.fit(X_train, y_train)

# predict on the test set
ans = model.predict(X_test)

print(model.score(X_test, y_test))  # R^2 on the test set</code>

Plotting feature importance

<code>gbm.plot_importance(model)  # creates its own axes, so no plt.figure() is needed
plt.show()</code>

Plotting the tree

<code># plotting a tree requires the graphviz package
gbm.plot_tree(model, tree_index=0)
plt.show()</code>

Reference

https://blog.csdn.net/anshuai_aw1/article/details/83659932

Tags: machine learning, Python, regression, classification, LightGBM, Gradient Boosting
Written by Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
