Baidu Geek Talk
Author

Recent Articles

Feb 2, 2026 · Artificial Intelligence

How Cloud AI Infra Powers the Next Wave of Embodied Intelligence

This article outlines the rapid rise of embodied intelligence, the explosion of Vision‑Language‑Action (VLA) research, and how cloud‑based AI infrastructure—including multi‑level IaaS, data pipelines, dual‑system model designs, and reinforcement‑learning workflows—addresses emerging scaling and deployment challenges.

VLA, multimodal models, reinforcement learning
13 min read

Jan 7, 2026 · Artificial Intelligence

How Baidu’s vLLM‑Kunlun Plugin Powered MiMo Flash V2 on Kunlun XPU in 2 Days

Within two days, Baidu’s Baige and Kunlun Chip teams adapted the 309‑billion‑parameter MiMo Flash V2 model—featuring a hybrid SWA+Sink and Full Attention mechanism—to run efficiently on the Kunlun P800 XPU using the vLLM‑Kunlun Plugin, achieving lossless performance comparable to GPU inference.

AI inference, Kunlun XPU, MiMo Flash V2
7 min read

Dec 24, 2025 · Artificial Intelligence

Context Parallelism Slashes TTFT by 80% for 128K-Token LLMs

The article explains how Baidu’s Baige team integrated a Context Parallelism (CP) strategy into DeepSeek V3.2, detailing the DSA architecture, the limitations of traditional tensor and sequence parallelism, and how CP distributes computation and memory across GPUs to cut time‑to‑first‑token (TTFT) latency by up to 80% for ultra‑long 128K‑token contexts.

Context Parallelism, DeepSeek, LLM
9 min read

Dec 17, 2025 · Artificial Intelligence

Accelerate LLM Deployment on Baidu Kunlun XPU with the Open‑Source vLLM‑Kunlun Plugin

The vLLM‑Kunlun Plugin, jointly released by Baidu Baige and Kunlun Chip, provides a high‑performance, zero‑intrusion solution for deploying open‑source large language models on domestic Kunlun XPU hardware, includes fused operators, precision‑validation and profiling tools, and supports over twenty mainstream and multimodal models.

Kunlun XPU, Performance optimization, model deployment
7 min read

Dec 10, 2025 · Artificial Intelligence

How Offloading Latent Cache Boosts DeepSeek‑V3.2‑Exp Decoding Throughput

This report analyzes the memory bottleneck of DeepSeek‑V3.2‑Exp’s sparse‑attention decoder, proposes the Expanded Sparse Server (ESS) to offload the latent cache to CPU memory, and demonstrates through high‑fidelity simulation that the approach dramatically improves decode throughput while keeping latency within acceptable limits.

Cache offload, GPU memory, LLM inference
20 min read

Nov 10, 2025 · Cloud Native

How Polar‑TCP Breaks Kernel Network Bottlenecks for Cloud‑Native High‑Performance Services

This article explains how traditional kernel network stacks struggle with high‑concurrency, low‑latency cloud data‑center workloads and introduces Baidu Intelligent Cloud’s Polar solution—Polar‑TCP and Polar‑RDMA—which combine user‑space DPDK drivers, a lightweight TCP stack, and an industrial RPC framework to achieve near‑RDMA performance while preserving compatibility with existing TCP ecosystems.

DPDK, Network Stack, Performance optimization
23 min read

Nov 5, 2025 · Artificial Intelligence

How AI Agents Are Revolutionizing E‑Commerce and Global Brand Expansion

In a round‑table hosted by Baidu Intelligent Cloud, industry leaders dissect how AI agents are transforming Chinese retail and overseas brand expansion. They address challenges such as rising traffic costs, low repurchase rates, and localization hurdles, and present concrete use cases in content generation, intelligent customer service, and automated marketing that promise to make AI agents an essential, not optional, component of modern commerce.

AI, Customer Service, digital transformation
17 min read

Oct 29, 2025 · Artificial Intelligence

How Baidu Transformed E‑commerce Risk Control with Multi‑Modal AI Agents

This article details Baidu's e‑commerce risk‑control overhaul, explaining how traditional rule‑based and manual reviews struggled with multimodal violations, ambiguous semantics, and poor merchant experience, and how a new AI‑driven pipeline combining large multimodal models, rule engines, and knowledge‑base queries achieved full automation, real‑time feedback, and high explainability.

AI, e-commerce, risk control
13 min read

Oct 15, 2025 · Artificial Intelligence

Can LLMs Automate Data Ingestion and Cut Integration Time from Months to Days?

This article presents an LLM‑driven ingestion solution for an intelligent data platform that automates schema recognition, mapping, quality rule extraction, and package building, reducing integration cycles from three months to three days while eliminating manual effort and enhancing scalability and control.

AI, Automation, Data Ingestion
21 min read

Oct 13, 2025 · Big Data

How Baidu Scaled Its Data Warehouse to Handle Billions of PVs and Petabytes

This article details Baidu APP's massive data‑warehouse overhaul, describing the two‑step strategy that stabilized log cleaning, modernized the ETL framework, introduced wide‑table architectures, and implemented tiered storage to dramatically improve processing speed, reliability, and cost efficiency for petabyte‑scale workloads.

Data Warehouse, ETL, Performance optimization
25 min read