How to Accelerate XGBoost Training with Tree Methods, Cloud Computing, and Ray
The article explains why XGBoost training can be slow despite its speed focus and presents three acceleration techniques—choosing an optimal tree_method, leveraging cloud resources for larger memory, and using Ray for distributed training—complete with code examples and benchmark results.
Gradient boosting is widely used for supervised learning, and XGBoost is a popular open-source implementation optimized for speed; even so, training can still be slow on large datasets.
Changing the tree construction method
XGBoost's tree_method parameter selects the algorithm used to build trees (exact, approx, hist, gpu_hist, or auto). Choosing the method best suited to the dataset can significantly reduce training time: in the benchmark below, switching from hist to gpu_hist roughly halved the runtime.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
import time

# Synthetic binary classification dataset: 100k rows, 1,000 features
X, y = make_classification(n_samples=100000, n_features=1000,
                           n_informative=50, n_redundant=0,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.50, random_state=1)
evalset = [(X_train, y_train), (X_test, y_test)]

# Time model fitting with each tree construction method
results = []
methods = ['exact', 'approx', 'hist', 'gpu_hist', 'auto']
for method in methods:
    model = XGBClassifier(learning_rate=0.02,
                          n_estimators=50,
                          objective="binary:logistic",
                          use_label_encoder=False,
                          tree_method=method)
    start = time.time()
    model.fit(X_train, y_train, eval_metric='logloss', eval_set=evalset)
    end = time.time()
    results.append(method + " Fit Time: " + str(end - start))
print(results)

If the operating system lacks native GPU support, the gpu_hist option should be omitted from the methods list.
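One way to handle this automatically is to probe for a usable GPU before building the methods list. The sketch below is one possible heuristic, not part of the original benchmark; it assumes the optional cupy package is a reasonable proxy for a working CUDA runtime, and silently falls back to CPU-only methods otherwise.

```python
def available_tree_methods():
    """Return tree_method values likely to work on this machine.

    GPU detection here is a heuristic: a successful cupy import plus at
    least one visible CUDA device is taken as evidence that gpu_hist
    will run. Any failure falls through to the CPU-only list.
    """
    methods = ['exact', 'approx', 'hist', 'auto']
    try:
        import cupy
        if cupy.cuda.runtime.getDeviceCount() > 0:
            methods.insert(3, 'gpu_hist')
    except Exception:
        pass  # no cupy or no CUDA runtime: stay CPU-only
    return methods

print(available_tree_methods())
```

Using this, the benchmark loop can iterate over available_tree_methods() instead of a hard-coded list.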
Leveraging cloud computing
Running XGBoost on cloud instances provides access to larger memory pools, which can accommodate bigger datasets and reduce paging overhead.
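Before choosing an instance type, it helps to estimate the in-memory footprint of the training matrix. The sketch below is a rough back-of-envelope calculation, not from the original article; the 3x overhead factor for XGBoost's internal copies and histogram structures is an assumption, not a measured figure.

```python
def estimate_memory_gb(n_samples, n_features, bytes_per_value=4,
                       overhead_factor=3.0):
    """Rough RAM estimate (GB) for training on a dense float32 matrix.

    overhead_factor covers the raw data plus working copies and
    histogram structures (an assumed multiplier, not a measured one).
    """
    raw_bytes = n_samples * n_features * bytes_per_value
    return raw_bytes * overhead_factor / 1024**3

# The benchmark dataset above: 100,000 rows x 1,000 float32 features
print(f"{estimate_memory_gb(100_000, 1_000):.1f} GB")  # about 1.1 GB
```

If the estimate exceeds the RAM of the local machine, a memory-optimized cloud instance avoids swapping to disk during training.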
Distributed XGBoost on Ray
Ray is a distributed framework that also offers a machine‑learning library. XGBoost‑Ray extends the native XGBoost API, allowing a single‑node script to scale to hundreds of nodes with multiple GPUs. The gradient exchange uses NCCL2, while inter‑node coordination relies on Rabit.
from xgboost_ray import RayXGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

seed = 42
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.25, random_state=seed)

# n_jobs sets the number of distributed actors Ray will launch
clf = RayXGBClassifier(n_jobs=4, random_state=seed)
clf.fit(X_train, y_train)

pred_ray = clf.predict(X_test)
print(pred_ray)

pred_proba_ray = clf.predict_proba(X_test)
print(pred_proba_ray)

The example demonstrates that only minimal code changes are required to switch from local XGBoost to distributed training.
In summary, the article presents three ways to speed up XGBoost training: selecting an optimal tree_method, using cloud resources for larger memory, and employing Ray for distributed execution.
This article has been distilled and summarized from source material, then republished for learning and reference.
Code DAO
