Feb 10, 2025 · Artificial Intelligence

Hybrid Cloud Elastic LLM Inference Solution with ACK Edge and KServe

This article presents a hybrid‑cloud solution that uses ACK Edge and KServe to dynamically allocate on‑premise and cloud GPU resources for large‑language‑model inference, addressing tidal traffic patterns, reducing costs, and ensuring high availability through elastic scaling and custom scheduling policies.

ACK@Edge · Auto Scaling · Hybrid Cloud

0 likes · 13 min read


Didi Tech

Aug 17, 2019 · Artificial Intelligence

Didi’s Elastic Inference Service & IFX Engine: Achieving World‑Class AI Inference

Didi’s Elastic Inference Service (EIS) and its IFX AI acceleration engine provide a distributed, cost‑effective inference platform that automatically scales resources to meet QPS and latency requirements and supports major deep‑learning frameworks. The platform excels in public‑cloud, private‑cloud, IoT and edge scenarios, and it achieved top‑ranked DAWNBench latency and cost scores on ImageNet using P4 GPUs.

AI inference · Benchmark · Deep Learning

0 likes · 7 min read


elastic inference
