
Optimizing AI Platform Resource Efficiency: Scheduling Strategies for Deep Learning Inference and Training

The article outlines a technical exchange hosted by 58.com AI Lab and Tianjin University that discusses high‑efficiency AI computing, resource‑aware scheduling for both online inference and offline training, and methods to mitigate GPU under‑utilization and gray‑interference in distributed deep‑learning platforms.

DataFunTalk

Rapid advances in artificial intelligence have dramatically increased the demand for high-efficiency intelligent computing systems. Within 58.com, many services now depend on deep-learning models, whose workloads show characteristic patterns: peak-and-valley online inference load, low GPU utilization during off-peak periods, and resource contention among offline training clusters.

On November 3, 58.com AI Lab and Tianjin University's School of Intelligent Computing co-hosted a technical exchange focused on two themes: efficient cluster resource scheduling and fine-grained mixed online-offline workload management, with the goal of improving the performance of deep-learning inference services and training jobs.

Schedule Introduction

Topic Analysis & Audience Benefits

Offline Training Job Resource Scheduling Optimization

New techniques: (1) priority-based scheduling for offline training tasks; (2) resource-usage prediction and adjustment for offline jobs. Attendees will learn how to apply these strategies to boost offline training resource utilization.
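The article does not give 58.com's implementation, but priority-based dispatch of training jobs can be sketched minimally as a max-priority queue drained against a cluster GPU budget (the `Job` fields and strict-priority policy below are illustrative assumptions, not the talk's actual design):

```python
import heapq

class Job:
    def __init__(self, name, priority, gpus):
        self.name, self.priority, self.gpus = name, priority, gpus

class PriorityScheduler:
    """Dispatch offline training jobs highest-priority-first,
    subject to a fixed cluster GPU budget (strict priority: a
    head-of-line job that does not fit blocks lower priorities)."""
    def __init__(self, total_gpus):
        self.free_gpus = total_gpus
        self._queue = []   # min-heap keyed on negated priority
        self._seq = 0      # tie-breaker: FIFO among equal priorities

    def submit(self, job):
        heapq.heappush(self._queue, (-job.priority, self._seq, job))
        self._seq += 1

    def dispatch(self):
        started = []
        while self._queue:
            _, _, job = self._queue[0]
            if job.gpus > self.free_gpus:
                break  # highest-priority waiting job does not fit; stop
            heapq.heappop(self._queue)
            self.free_gpus -= job.gpus
            started.append(job.name)
        return started
```

A production scheduler would add preemption and the talk's second ingredient, resource-usage prediction, on top of this skeleton.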

High-Throughput Distributed Training Cluster Scheduling Based on Task Predictability

New techniques: (1) dynamic resource scheduling for predictable tasks; (2) unified priority scheduling for mixed workloads. Attendees will learn how task predictability is defined, along with heterogeneous resource scheduling strategies and unified priority-based scheduling policies.
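One common way to exploit task predictability, sketched here under assumptions the article does not confirm, is to estimate each recurring task's runtime from its execution history and order predictable tasks shortest-first (unpredictable tasks fall back to a pessimistic default):

```python
from statistics import mean

def predict_runtime(history, default=3600.0):
    """Estimate a task's runtime (seconds) as the mean of past runs
    of the same task type; tasks with no history are treated as
    unpredictable and get a pessimistic default estimate."""
    return mean(history) if history else default

def order_tasks(tasks, histories):
    """Order tasks shortest-predicted-first to raise cluster throughput;
    unpredictable tasks sort last via the large default estimate."""
    return sorted(tasks, key=lambda t: predict_runtime(histories.get(t, [])))
```

Shortest-job-first ordering is a classic throughput heuristic; the talk's actual definition of predictability and its heterogeneous-resource policies are richer than this sketch.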

Mixed Online Inference and Offline Training in a Deep-Learning Platform

New techniques: (1) automatic elastic scaling for inference services; (2) dynamic resource scheduling for mixed online-offline workloads. Attendees will gain insight into elastic-scaling solutions for model inference and the implementation of mixed deployment of offline jobs alongside online services.
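Elastic scaling for inference services is commonly driven by a proportional rule of the kind Kubernetes' Horizontal Pod Autoscaler uses; the sketch below assumes that style of rule (the article does not state 58.com's exact formula or metrics):

```python
import math

def desired_replicas(current_replicas, avg_load_per_replica,
                     target_load_per_replica,
                     min_replicas=1, max_replicas=50):
    """Proportional autoscaling: size the replica set so that total
    observed load (e.g. QPS) divided across replicas hits the target,
    clamped to [min_replicas, max_replicas]."""
    total_load = current_replicas * avg_load_per_replica
    wanted = math.ceil(total_load / target_load_per_replica)
    return max(min_replicas, min(max_replicas, wanted))
```

Scaling inference down during traffic valleys is what frees the GPUs that mixed deployment then lends to offline training, tying this talk to the utilization problem described above.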

Gray-Interference Research and Application Co-location in Distributed Micro-Service Scenarios

New techniques: (1) spatio-temporal encoding for service-performance and interference prediction; (2) fine-grained application co-location at the micro-service component level. Attendees will learn about the "gray interference" phenomenon in cloud services and how fine-grained resource management can improve system efficiency.
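The talk's spatio-temporal encoding model is not described in the article; as a much simpler stand-in, interference-aware placement can be sketched with a hypothetical additive pairwise-interference table, choosing the node where a new component's predicted slowdown from co-located residents is lowest:

```python
def predicted_slowdown(candidate, residents, pairwise):
    """Sum pairwise interference factors between the candidate component
    and the components already resident on a node (a crude additive
    model; a learned predictor would replace this table)."""
    return sum(pairwise.get(frozenset((candidate, r)), 0.0)
               for r in residents)

def place(candidate, nodes, pairwise):
    """Pick the node (name -> resident components) where the candidate's
    predicted interference is lowest."""
    return min(nodes,
               key=lambda n: predicted_slowdown(candidate, nodes[n], pairwise))
```

The point of "gray interference" is precisely that such slowdowns are hard to observe directly, which is why the talk predicts them from encoded performance signals rather than a static table like this one.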

AI · deep learning · Resource Scheduling · GPU utilization · Inference · Training
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
