Baidu Geek Talk

Follow us to discover more Baidu tech insights.

511 Articles · 0 Likes · 879 Views · 0 Comments

Recent Articles
Baidu Geek Talk
Feb 19, 2025 · Frontend Development

Technical Practice of Baidu Live‑Streaming Interactive Framework: Performance and Stability Optimization

The Baidu live-streaming interactive framework optimized performance and stability for music and red-packet activities using component reuse, page pre-static generation, SSR, ISR, prefetching, view prerendering, fallback mechanisms, and animation downgrades, cutting first-screen load time to 0.5 s and yielding a reusable solution for large-scale live events.

Front-end Architecture · Performance Optimization · SSR
0 likes · 16 min read
Baidu Geek Talk
Feb 17, 2025 · Operations

How Baidu Netdisk Prevents Service Avalanches: Dynamic Circuit Breaking & Queue Control

This article analyzes Baidu Netdisk's anti‑avalanche architecture, explaining how avalanche cascades occur in high‑concurrency services and detailing practical prevention, blocking, and mitigation techniques such as dynamic circuit breaking, traffic isolation, request‑validity checks, and socket‑level detection to maintain system reliability.

Circuit Breaking · Dynamic Throttling · Avalanche Mitigation
0 likes · 18 min read
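The dynamic circuit breaking this article describes can be pictured as a sliding-window failure-rate breaker that sheds load instead of letting requests queue up. The class below is a minimal illustrative sketch; the thresholds, window size, and cooldown are hypothetical values, not Netdisk's actual configuration:

```python
import time

class CircuitBreaker:
    """Sliding-window failure-rate circuit breaker (illustrative sketch)."""

    def __init__(self, failure_threshold=0.5, window=20, cooldown=5.0):
        self.failure_threshold = failure_threshold  # open when failure rate reaches this
        self.window = window                        # number of recent calls tracked
        self.cooldown = cooldown                    # seconds to stay open
        self.results = []                           # True = success, False = failure
        self.opened_at = None

    def allow(self):
        """Return False while the breaker is open; let a probe through after cooldown."""
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                return False            # open: reject fast instead of queuing
            self.opened_at = None       # half-open: allow a probe request
            self.results.clear()
        return True

    def record(self, success):
        """Record a call outcome and trip the breaker if the failure rate is too high."""
        self.results.append(success)
        self.results = self.results[-self.window:]
        failures = self.results.count(False)
        if len(self.results) >= self.window and failures / len(self.results) >= self.failure_threshold:
            self.opened_at = time.monotonic()
```

A caller wraps each downstream request in `allow()`/`record()`, so a failing backend is cut off quickly and probed again only after the cooldown, which is the basic mechanism behind avalanche blocking.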
Baidu Geek Talk
Feb 12, 2025 · Artificial Intelligence

Deploy DeepSeek, Llama, Qwen Models Fast on Baidu Baige AI Heterogeneous Platform

This guide walks you through creating a lightweight compute instance, adding it to the Baidu Baige AI heterogeneous computing platform, deploying the vLLM tool, and loading and serving small-scale dense models such as DeepSeek, Llama, and Qwen, closing with recommended configuration lists for low-cost, high-performance inference.

AI Model Deployment · Baidu Baige · Cloud AI
0 likes · 3 min read
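The core of the deployment flow the guide summarizes is serving a model through vLLM's OpenAI-compatible server. A minimal sketch of the serving command is shown below; the model name and flag values are illustrative placeholders, not the guide's recommended configuration:

```shell
# Install vLLM, then serve a small dense model via its
# OpenAI-compatible HTTP server (requires a suitable GPU).
pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 1 \
    --max-model-len 4096 \
    --port 8000
```

Once running, any OpenAI-compatible client can send chat or completion requests to `http://<instance>:8000/v1`, which is what makes this setup convenient on a managed platform like Baige.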
Baidu Geek Talk
Feb 10, 2025 · Artificial Intelligence

How Baidu Cloud Slashes Inference Costs: DeepSeek Model Optimizations Unveiled

Baidu Cloud's Qianfan platform launched DeepSeek‑R1 and DeepSeek‑V3 with ultra‑low inference pricing, leveraging advanced engine performance tweaks, a split Prefill/Decode architecture, and comprehensive security measures that together boost throughput, cut costs, and ensure enterprise‑grade reliability.

AI Inference · Baidu Cloud · Model Serving
0 likes · 5 min read
Baidu Geek Talk
Feb 5, 2025 · Artificial Intelligence

How to Unlock Full GPU Efficiency for Enterprise AI Platforms

This article analyzes common GPU efficiency problems in enterprise AI compute platforms—such as low utilization, long fault‑resolution times, and limited performance gains—and presents three practical solutions: dynamic resource allocation, systematic fault‑tolerance, and system‑level tuning, illustrated with real‑world case studies.

AI Platform · GPU Utilization · Large Model Training
0 likes · 11 min read
Baidu Geek Talk
Jan 22, 2025 · Mobile Development

iOS Sandbox Disk Management and Cleaning Strategies

The article explains iOS sandbox storage by detailing the four main directories, their backup rules, naming conventions, and retrieval APIs. It then shows how to calculate physical file size and implements both automatic quota-based and manual user-driven cleaning, including removal of system caches for tmp, WKWebView, and dyld.

Cache Cleaning · Objective-C · Disk Management
0 likes · 22 min read
Baidu Geek Talk
Jan 20, 2025 · Industry Insights

How Baidu’s Qianfan AppBuilder Is Redefining AI‑Native App Development

The interview explores how Baidu Cloud's Qianfan AppBuilder platform evolves from traditional coding to AI‑native low‑code development, detailing the impact of large‑model agents, Retrieval‑Augmented Generation, security, multimodal support, and future roadmap on enterprise productivity and digital transformation.

AI Agents · AI-Native Apps · Enterprise AI
0 likes · 18 min read
Baidu Geek Talk
Jan 15, 2025 · Artificial Intelligence

Understanding Large Model Inference Engines and Reducing Token Interval (TPOT)

Large-model inference engines convert prompts into responses through a Prefill stage followed by autoregressive decoding, with latency measured by TTFT (time to first token) and TPOT (time per output token). Baidu's AIAK suite improves TPOT by separating tokenization from decoding, using static slot scheduling, and executing asynchronously, cutting token-interval latency from ~35 ms to ~14 ms and raising GPU utilization to about 75 %, while quantization and speculative execution deliver further throughput gains.

AI Acceleration · GPU Utilization · TPOT
0 likes · 10 min read
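The latency figures in the summary rest on two standard metrics: TTFT, the delay before the first token arrives, and TPOT, the mean interval between successive output tokens after the first. A minimal sketch of computing both from token arrival timestamps follows; the trace is made up, chosen only to match the ~14 ms figure:

```python
def ttft_and_tpot(request_start, token_timestamps):
    """Compute TTFT and TPOT (seconds) from a request start time and the
    arrival time of each generated token."""
    ttft = token_timestamps[0] - request_start
    if len(token_timestamps) < 2:
        return ttft, None
    # TPOT: average gap between consecutive tokens after the first
    tpot = (token_timestamps[-1] - token_timestamps[0]) / (len(token_timestamps) - 1)
    return ttft, tpot

# Hypothetical trace: first token at 0.20 s, then one token every 14 ms
start = 0.0
stamps = [0.20 + 0.014 * i for i in range(101)]
ttft, tpot = ttft_and_tpot(start, stamps)  # ttft ≈ 0.20 s, tpot ≈ 0.014 s
```

Shrinking TPOT is what shortens the visible gap between streamed tokens, which is why the article treats it as the headline metric for decode-phase optimization.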
Baidu Geek Talk
Jan 13, 2025 · Industry Insights

Top 12 Must-Read Baidu Tech Articles of 2024: Insights & Innovations

This roundup highlights twelve standout Baidu Geek articles from 2024, covering breakthroughs in search personalization, high‑performance Go services, transaction reconciliation, login system evolution, AI‑native applications, microservice governance, caching algorithms, RLHF optimization, ClickHouse deployment, and more, each with concise recommendation reasons.

2024 · AI · Baidu
0 likes · 8 min read
Baidu Geek Talk
Jan 8, 2025 · Artificial Intelligence

Evolution of Video Search Ranking Architecture Towards an End‑to‑End Large‑Model Framework

The article outlines how video search ranking has shifted from a tightly coupled multi-stage cascade to Rankflow, an extensible, end-to-end, model-centric framework. Rankflow leverages large-model inference, decoupled recall, fine-grained parallelism, and elastic compute allocation to improve performance, flexibility, and maintainability, paving the way for future retrieval-augmented generation integration.

AI · Parallel Computing · Elastic Resources
0 likes · 11 min read