Accelerating Gradient Boosting with CatBoost

This article explains how CatBoost implements gradient boosting and handles categorical features without preprocessing, lists its key advantages, details common training parameters, and walks through a regression example with code for fitting, cross-validation, grid search, tree visualization, and parameter inspection.


In gradient boosting, predictions are made by an ensemble of weak learners: trees are added sequentially, previous trees are never altered, and each new tree is fit to correct the errors of the ensemble built so far. This article introduces CatBoost, a gradient-boosting library developed by Yandex that grows balanced (symmetric, or "oblivious") decision trees, applying the same split condition to every node on a given level.
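
To make the sequential idea concrete, here is a minimal conceptual sketch of gradient boosting for squared error, using scikit-learn stumps; it illustrates the general principle only, not CatBoost's internal scheme (ordered boosting, symmetric trees):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_trees=100, lr=0.1):
    # Start from a constant prediction (the mean), then let each new tree
    # fit the residuals of the ensemble built so far.
    pred = np.full(len(y), y.mean())
    trees = []
    for _ in range(n_trees):
        residuals = y - pred                          # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        pred += lr * tree.predict(X)                  # shrink each tree's contribution
        trees.append(tree)
    return trees, pred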

CatBoost processes categorical features natively, eliminating the need for one-hot or label encoding: users simply list the categorical columns in the cat_features parameter, and this built-in handling improves both training speed and prediction quality.
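
For example, a minimal sketch of declaring categorical columns at fit time (the dataset and column indices here are hypothetical):

from catboost import CatBoostClassifier

model = CatBoostClassifier(iterations=100, verbose=False)
# Columns 0 and 2 hold categorical values in this hypothetical dataset;
# CatBoost encodes them internally, so no manual preprocessing is needed.
model.fit(X_train, y_train, cat_features=[0, 2])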

Advantages of using CatBoost

Supports multi‑GPU training.

Provides strong out‑of‑the‑box results with default parameters, reducing tuning time.

Reduces overfitting, notably via its ordered boosting scheme, which improves accuracy.

Offers fast inference.

Trained models can be exported to Core ML for on-device inference on iOS (see the export sketch after this list).

Handles missing values internally.

Applicable to both regression and classification tasks.
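
As an illustration of the Core ML point above, a fitted model can be exported with save_model; format='coreml' follows the CatBoost documentation, and the export parameter shown is an optional metadata field:

model.save_model(
    'model.mlmodel',
    format='coreml',
    export_parameters={'coreml_description': 'Example CatBoost model'},  # optional metadata
)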

Common training parameters

loss_function – metric to optimize (e.g., RMSE for regression, Logloss for classification).
eval_metric – metric used for overfitting detection.
iterations (aliases num_boost_round, n_estimators, num_trees) – maximum number of trees (default 1000).
learning_rate (alias eta) – step size (default 0.03).
random_seed (alias random_state) – random seed for reproducibility.
l2_leaf_reg (alias reg_lambda) – L2 regularization coefficient (default 3.0).
bootstrap_type – sampling method for object weights (e.g., Bayesian, Bernoulli, MVS, Poisson).
depth – depth of each tree.
grow_policy – tree growth algorithm (SymmetricTree, Depthwise, Lossguide); SymmetricTree is the default.
min_data_in_leaf (alias min_child_samples) – minimum samples per leaf (only with Depthwise or Lossguide).
max_leaves (alias num_leaves) – maximum number of leaves (Lossguide only).
ignored_features – features to ignore during training.
nan_mode – handling of missing values (Forbidden, Min, Max).
leaf_estimation_method – method used to calculate values in leaves (e.g., Newton, Gradient).
leaf_estimation_backtracking – backtracking type during gradient descent (AnyImprovement, Armijo).
boosting_type – boosting scheme (Ordered or Plain).
score_function – split scoring metric (Cosine by default; L2, NewtonL2, NewtonCosine).
early_stopping_rounds – stops training after the given number of iterations without improvement.
classes_count – number of classes for multiclass problems.
task_type – CPU or GPU (CPU by default).
devices – IDs of the GPU devices to use for training.
cat_features – indices of the categorical columns.
text_features – indices of the text columns for text classification.
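
Putting several of these together, a configured regressor might look like the sketch below (the values are illustrative, not tuned recommendations):

from catboost import CatBoostRegressor

model = CatBoostRegressor(
    iterations=500,            # cap on the number of boosting trees
    learning_rate=0.05,        # step size (eta)
    depth=6,                   # depth of each symmetric tree
    l2_leaf_reg=3.0,           # L2 regularization on leaf values
    loss_function='RMSE',      # metric being optimized
    early_stopping_rounds=50,  # halt when the eval metric stops improving
    random_seed=42,
    task_type='CPU',           # switch to 'GPU' for GPU training
)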

Regression example

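The snippets below assume an existing train/test split named X_train, y_train, X_test, y_test; one hedged way to create it, using scikit-learn's California housing dataset (any regression dataset would do):

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load a regression dataset and hold out 20% for evaluation.
data = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
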
from catboost import CatBoostRegressor

# Defaults work well out of the box; plot=True draws a live learning
# curve when running in a Jupyter notebook.
cat = CatBoostRegressor()
cat.fit(X_train, y_train, verbose=False, plot=True)

Cross‑validation with visualisation:

from catboost import Pool, cv

# Wrap the training data in a Pool and run 2-fold cross-validation;
# plot=True renders the CV curves in a notebook.
params = {"iterations": 100, "depth": 2, "loss_function": "RMSE", "verbose": False}
cv_dataset = Pool(data=X_train, label=y_train)
scores = cv(cv_dataset, params, fold_count=2, plot=True)
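
cv returns a pandas DataFrame of per-iteration fold statistics; assuming the usual test-<metric>-mean column naming for the RMSE loss above, the best iteration can be located directly:

# Find the boosting iteration with the lowest mean test RMSE.
best = scores['test-RMSE-mean'].idxmin()
print(best, scores.loc[best, 'test-RMSE-mean'])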

Grid search with visualisation:

# Search over 2 x 3 x 5 = 30 parameter combinations; plot=True
# visualises the search in a notebook.
grid = {
    'learning_rate': [0.03, 0.1],
    'depth': [4, 6, 10],
    'l2_leaf_reg': [1, 3, 5, 7, 9]
}
grid_search_result = cat.grid_search(grid, X=X_train, y=y_train, plot=True)
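
grid_search returns a dictionary; per the CatBoost API, the best combination sits under the 'params' key, with per-iteration metrics under 'cv_results':

print(grid_search_result['params'])             # best hyperparameter combination found
print(grid_search_result['cv_results'].keys())  # cross-validation metric curves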

Tree plotting example: because CatBoost grows symmetric trees, every node on a given level applies the same split condition (e.g., value > 0.5):

cat.plot_tree(tree_idx=0)
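
In recent versions plot_tree returns a graphviz object (the graphviz package must be installed), so the visualization can also be written to disk; a small sketch:

graph = cat.plot_tree(tree_idx=0)
graph.render('tree0')  # writes tree0 and tree0.pdf via graphviz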

Printing all model parameters:

# get_all_params() reports every parameter of the trained model,
# including defaults that were never set explicitly.
for key, value in cat.get_all_params().items():
    print(f'{key}, {value}')
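
To round out the workflow, prediction follows the usual scikit-learn pattern; a short sketch assuming the X_test and y_test held out earlier:

from sklearn.metrics import mean_squared_error

preds = cat.predict(X_test)
rmse = mean_squared_error(y_test, preds) ** 0.5  # RMSE, matching the training loss
print(f'Test RMSE: {rmse:.3f}')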

Conclusion

This article presented CatBoost's benefits, listed its main training parameters, and demonstrated a simple regression workflow using the scikit-learn-compatible API.
