Design and Practice of Turing OS: An Online Service Framework for Machine Learning and Deep Learning at Meituan

Meituan’s Turing OS unifies the end‑to‑end machine‑learning lifecycle—data preprocessing, feature generation, model training, deployment, online prediction and A/B testing—through a lightweight SDK, plugin‑based algorithms, DAG orchestration, sandbox validation and replay tools, cutting algorithm iteration from days to hours while handling billions of daily predictions.

Meituan Technology Team

The article introduces the Turing platform built by Meituan Delivery's technology team, a one-stop algorithm platform that provides end-to-end services covering data preprocessing, feature generation, model training, evaluation, deployment, online prediction, A/B testing, and algorithm effect assessment. Its online service framework, Turing OS, focuses on serving machine-learning and deep-learning workloads online, offering a unified way to deploy models and algorithm strategies at scale.

Background : In the early days of Meituan Delivery, each business line developed its own online prediction components (the “chimney model”, in which every team builds an isolated stack). This led to duplicated effort, a lack of shared platform capabilities, and tight coupling between algorithms and engineering. As the business grew, these issues became bottlenecks, prompting the creation of a unified online service framework.

Turing OS 1.0 : Integrated model computation with preprocessing and post-processing via an SDK. It eliminated the reinvention of the wheel and simplified development, but it only partly reduced coupling: algorithm packages, business services, and the platform remained tightly coupled, and deployment-related pain points persisted.

Turing OS 2.0 addresses the shortcomings of 1.0 with several key features:

Standardized lightweight SDK : A minimal SDK that hides routing, feature fetching, and model execution, lowering integration effort.
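As a rough illustration of what such a facade might look like, the sketch below exposes a single `predict` call and keeps routing, feature fetching, and model execution behind it. The names `TuringClient`, `PredictRequest`, `scene`, and `entity_id` are hypothetical and are not taken from the actual Turing OS SDK; plain dictionaries stand in for the real routing table, feature store, and model runtime.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PredictRequest:
    scene: str      # hypothetical business scene id, used for routing
    entity_id: str  # hypothetical key used to look up features


class TuringClient:
    """Hypothetical facade: callers only ever see predict()."""

    def __init__(
        self,
        routes: Dict[str, str],
        feature_store: Dict[str, List[float]],
        models: Dict[str, Callable[[List[float]], float]],
    ) -> None:
        self._routes = routes
        self._feature_store = feature_store
        self._models = models

    def predict(self, request: PredictRequest) -> float:
        model_name = self._routes[request.scene]           # routing
        features = self._feature_store[request.entity_id]  # feature fetching
        return self._models[model_name](features)          # model execution


# In-memory stand-ins for the platform's real backends (assumptions).
client = TuringClient(
    routes={"eta": "eta_linear_v1"},
    feature_store={"order-1": [2.0, 3.0]},
    models={"eta_linear_v1": lambda f: 10.0 + 2.0 * f[0] + 1.0 * f[1]},
)
result = client.predict(PredictRequest(scene="eta", entity_id="order-1"))
```

The business service depends only on `predict`, so the platform can change how routing or feature lookup works without touching caller code.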

Algorithm pluginization : Algorithms are packaged as plugins and hot‑deployed in Turing OS via custom class loaders, enabling version‑independent releases.
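The article describes hot deployment through custom class loaders. Python has no direct equivalent, so the following sketch substitutes a fresh module namespace per plugin version to show the same idea: each version is loaded in isolation and swapped in without restarting the host. `PluginRegistry`, `deploy`, `invoke`, and the `run(payload)` entry point are illustrative names, not the platform's API.

```python
import types
from typing import Any, Dict, Tuple


class PluginRegistry:
    """Hypothetical host that hot-swaps algorithm plugins by name."""

    def __init__(self) -> None:
        self._active: Dict[str, Tuple[str, types.ModuleType]] = {}

    def deploy(self, name: str, version: str, source: str) -> None:
        # Each version gets its own module object, so its globals cannot
        # collide with those of any other plugin or version.
        module = types.ModuleType(f"{name}_{version}")
        exec(source, module.__dict__)
        if not callable(getattr(module, "run", None)):
            raise ValueError("plugin must define run(payload)")
        # Swap in the new version only after it loaded successfully.
        self._active[name] = (version, module)

    def invoke(self, name: str, payload: Dict[str, Any]) -> Tuple[str, Any]:
        version, module = self._active[name]
        return version, module.run(payload)


registry = PluginRegistry()
registry.deploy("pricing", "1.0", "def run(payload):\n    return payload['base'] * 2\n")
first = registry.invoke("pricing", {"base": 5})

# Deploy a new version while the host keeps running.
registry.deploy("pricing", "1.1", "def run(payload):\n    return payload['base'] * 2 + 1\n")
second = registry.invoke("pricing", {"base": 5})
```

If a new version fails to load, the previous one stays active, because the registry entry is replaced only after validation.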

Data channel : Allows algorithms to fetch required data autonomously, reducing the need for business services to mediate data flow.

Algorithm orchestration (DAG) : Operators (model‑compute or algorithm‑compute) can be composed serially or in parallel, forming directed acyclic graphs for complex workflows.
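One way to picture this orchestration is a small executor that runs operators in dependency order and submits independent operators to a thread pool together. This is a minimal sketch using Python's standard `graphlib` (Python 3.9+), not the Turing OS engine; the operator names and the `run_dag` signature are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set

Operator = Callable[[Dict[str, Any], Dict[str, Any]], Any]


def run_dag(
    operators: Dict[str, Operator],
    dependencies: Dict[str, Set[str]],
    request: Dict[str, Any],
) -> Dict[str, Any]:
    """Run each operator once all of its upstream operators have finished."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results: Dict[str, Any] = {}
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Every operator returned here has all its inputs ready, so the
            # whole batch can be submitted to the pool at once.
            futures = {
                name: pool.submit(
                    operators[name],
                    request,
                    {dep: results[dep] for dep in dependencies.get(name, ())},
                )
                for name in sorter.get_ready()
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# Hypothetical workflow: one feature operator feeds two models in parallel,
# and a final operator combines their outputs.
operators = {
    "features": lambda req, up: {"distance_km": req["distance_km"]},
    "eta_model": lambda req, up: up["features"]["distance_km"] * 4,
    "fee_model": lambda req, up: 3 + up["features"]["distance_km"],
    "combine": lambda req, up: {"eta_min": up["eta_model"], "fee": up["fee_model"]},
}
dependencies = {
    "features": set(),
    "eta_model": {"features"},
    "fee_model": {"features"},
    "combine": {"eta_model", "fee_model"},
}
results = run_dag(operators, dependencies, {"distance_km": 2})
```

Here `eta_model` and `fee_model` form a parallel stage, while `features` and `combine` run serially before and after it.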

Multi‑mode integration : Supports Standalone (service‑level deployment) and Embedded (in‑process deployment) modes, each with trade‑offs in coupling and performance.
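The trade-off between the two modes can be sketched with two clients that share one `predict` interface: the embedded client calls the algorithm in-process, while the standalone client crosses a serialization boundary to reach a separate service. Everything here is hypothetical, including the class names, the JSON wire format, and `fake_server`, which merely stands in for a remote Turing OS deployment.

```python
import json
from typing import Callable, Dict


def score(features: Dict[str, float]) -> float:
    """Hypothetical algorithm shared by both modes."""
    return sum(features.values())


class EmbeddedClient:
    """In-process mode: no network hop, but the algorithm shares the host process."""

    def __init__(self, algorithm: Callable[[Dict[str, float]], float]) -> None:
        self._algorithm = algorithm

    def predict(self, features: Dict[str, float]) -> float:
        return self._algorithm(features)


class StandaloneClient:
    """Service mode: looser coupling, at the cost of serialization and a remote call."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def predict(self, features: Dict[str, float]) -> float:
        reply = self._transport(json.dumps({"features": features}))
        return json.loads(reply)["result"]


def fake_server(raw_request: str) -> str:
    # Stand-in for a remote service; a real deployment would use RPC or HTTP.
    features = json.loads(raw_request)["features"]
    return json.dumps({"result": score(features)})


features = {"a": 1.5, "b": 2.0}
embedded_result = EmbeddedClient(score).predict(features)
standalone_result = StandaloneClient(fake_server).predict(features)
```

Because both clients expose the same method, a business service could switch modes through configuration rather than code changes, which is one plausible reading of "multi-mode integration".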

Turing sandbox : An isolated environment mirroring production for safe pre‑release validation, traffic diversion, and performance stress testing.

Unified replay platform : Records inputs, outputs, features, and models using Elasticsearch and HBase, enabling offline replay and debugging.
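The record-and-replay idea can be illustrated with an in-memory store that captures each prediction and later re-runs a candidate model over the recorded features. This is only a sketch under stated assumptions: a Python list replaces Elasticsearch and HBase, and `PredictionRecord`, `ReplayStore`, and the two model functions are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class PredictionRecord:
    request_id: str
    model_version: str
    inputs: Dict[str, float]
    features: Dict[str, float]
    output: float


class ReplayStore:
    """In-memory stand-in for the platform's persistent record storage."""

    def __init__(self) -> None:
        self._records: List[PredictionRecord] = []

    def record(self, rec: PredictionRecord) -> None:
        self._records.append(rec)

    def replay(
        self, model: Callable[[Dict[str, float]], float]
    ) -> List[Tuple[str, float, float]]:
        """Re-run `model` on recorded features: (request_id, recorded, replayed)."""
        return [(r.request_id, r.output, model(r.features)) for r in self._records]


def model_v1(features: Dict[str, float]) -> float:
    return features["distance_km"] * 4


def model_v2(features: Dict[str, float]) -> float:
    return features["distance_km"] * 4 + 1


store = ReplayStore()
for request_id, distance in [("r1", 2.0), ("r2", 3.0)]:
    feats = {"distance_km": distance}
    store.record(PredictionRecord(request_id, "v1", feats, feats, model_v1(feats)))

# Replaying the same version should reproduce the recorded outputs exactly;
# replaying a candidate version surfaces every request whose output changes.
same = [rid for rid, old, new in store.replay(model_v1) if old != new]
changed = [rid for rid, old, new in store.replay(model_v2) if old != new]
```

Storing the features alongside inputs and outputs is what makes offline debugging deterministic: the replay does not depend on whatever the feature store returns today.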

Performance testing & optimization : Leverages Meituan’s Quake full‑link testing system to generate traffic from replay data, perform stress tests, and produce diagnostic reports.

Results : Over 20 Turing OS clusters have been deployed, supporting 25+ algorithm packages, 50+ algorithms, and processing billions of online predictions daily. Algorithm iteration cycles have been reduced from days to hours, with most steps fully automated and independent of engineering or testing teams.

Future outlook : Plans include automated operation and testing tools, deeper integration with Spark for large‑scale validation, and a graphical workflow engine to further accelerate algorithm development and deployment.
