Tagged articles

Performance Optimization

1973 articles · Page 1 of 20
IT Learning Made Simple
Jul 4, 2026 · Game Development

The IT Architecture Behind the Viral Mini‑Game “Cupping” and Its Stress‑Relief Appeal

The article explains how the new WeChat mini‑game “Cupping” combines a simple, stress‑relieving match‑3 mechanic with a three‑layer architecture—native engine, JavaScript game logic, and resource rendering—leveraging lightweight packaging, cross‑device rendering, performance optimizations and cloud storage to deliver instant, smooth play on any device.

Cloud Services · Cross-Platform · Game Architecture
0 likes · 8 min read
JD Cloud Developers
Jun 25, 2026 · Artificial Intelligence

JD Donates Oxygen xLLM: Open‑Source Large‑Model Inference Engine Boosts China’s AI Infrastructure

JD announced the donation of its Oxygen xLLM inference engine to the OpenAtom Open‑Source Foundation, detailing its service‑engine decoupled architecture, performance breakthroughs across e‑commerce, power and public‑safety workloads, and a roadmap to expand the open‑source AI ecosystem.

AI Infrastructure · Oxygen xLLM · Performance Optimization
0 likes · 8 min read
JD Tech Talk
Jun 25, 2026 · Artificial Intelligence

JD Donates Oxygen xLLM Inference Engine to OpenAtom, Boosting China’s AI Infra Ecosystem

On June 24, 2026 JD announced the donation of its Oxygen xLLM large‑model inference engine to the OpenAtom Open Source Foundation, detailing its service‑engine decoupled architecture, performance breakthroughs, heterogeneous chip support, and real‑world gains in e‑commerce, power‑grid and public‑safety applications while outlining a roadmap for broader ecosystem co‑building and standards leadership.

AI Infrastructure · Oxygen xLLM · Performance Optimization
0 likes · 7 min read
Raymond Ops
Jun 25, 2026 · Operations

Linux Kernel Sysctl Tuning: Common Pitfalls and Values You Shouldn’t Change Blindly

This guide explains how to safely tune Linux kernel sysctl parameters by first identifying the problem layer, backing up current settings, applying targeted changes, and verifying effects, while highlighting common mis‑configurations, real‑world case studies, best‑practice recommendations, and monitoring strategies.

Linux · Memory Management · Monitoring
0 likes · 18 min read
Spring Full-Stack Practical Cases
Jun 24, 2026 · Backend Development

Ditch Traditional JSON Parsing: Boost Spring Boot API Performance by 30×

A four‑month investigation revealed that Jackson’s default object‑mapper consumed over 60% of CPU time during order‑submission requests, causing 900 ms latency; switching to Jackson’s streaming API reduced average response time from 912 ms to 28 ms, cut GC pauses, and increased throughput eight‑fold, while introducing readability and validation trade‑offs.

JSON parsing · Java · Performance Optimization
0 likes · 8 min read
dbaplus Community
Jun 22, 2026 · Operations

Why Switching Linux Page Size from 4KB to 2MB Can Crash Your Performance

The article explains that blindly replacing Linux's default 4KB pages with 2MB hugepages can dramatically increase memory usage, cause cache conflicts and page‑fault latency, and ultimately degrade the performance of micro‑service workloads despite improving TLB hit rates.

HugePages · Linux · Memory Management
0 likes · 19 min read
Deepin Linux
Jun 22, 2026 · Backend Development

Memory Pool vs Object Pool: When to Choose and How to Build One from Scratch

The article explains why high‑concurrency programs suffer from memory fragmentation and system‑call overhead, compares memory pools and object pools, outlines their distinct use‑cases, provides step‑by‑step C and C++ implementations, and highlights optimization tips and common pitfalls.

C++ · High concurrency · Object Pool
0 likes · 20 min read
Wu Shixiong's Large Model Academy
Jun 20, 2026 · Artificial Intelligence

How I Burned $15K on Claude Code in a Month and Finally Mastered Skill Writing

After spending nearly $15,000 on Claude Code and Codex in a single month, the author discovered that most of his dozens of skills were never invoked, learned the progressive‑disclosure mechanism, rewrote skill descriptions, added verification steps, organized skills as folders with scripts and hooks, and now knows how to identify and optimize the truly useful skills.

AI Agents · Claude Code · Performance Optimization
0 likes · 19 min read
Deepin Linux
Jun 20, 2026 · Fundamentals

Why Using Pipes Can Max Out Your CPU: Hidden Costs and Fixes

Although Linux pipes avoid disk I/O and seem faster, misuse such as tiny frequent writes, mismatched read/write speeds, non‑blocking tight loops, and improper fd handling can drive a single core to 100% CPU; the article explains the underlying causes and walks through step‑by‑step optimizations to prevent it.

CPU usage · IPC · Linux
0 likes · 17 min read
Spring Full-Stack Practical Cases
Jun 19, 2026 · Backend Development

Java Pooling Under High Concurrency: Resource Reuse and Performance Optimization

The article explains Java pooling techniques for high‑concurrency scenarios, introduces Apache Commons Pool 2, demonstrates how to configure dependencies, implement a PooledObjectFactory, create custom eviction policies and statistics, and shows a complete runnable example that highlights resource reuse and performance gains.

Java · Performance Optimization · Spring Boot
0 likes · 8 min read
Sohu Tech Products
Jun 17, 2026 · Fundamentals

Kotlin Inline Functions: More Than Just a Performance Trick

This article explains how Kotlin's inline keyword eliminates lambda object allocation and virtual calls, enables non‑local returns and reified generics, discusses inline properties and their performance benefits, and outlines scenarios where inlining can backfire, helping developers use it wisely.

Android · Kotlin · Performance Optimization
0 likes · 12 min read
Architect's Guide
Jun 16, 2026 · Backend Development

How to Insert 300,000 Records in 13 Seconds with MyBatis and JDBC

The article compares several ways of inserting 300,000 MySQL rows—single‑row loops, an un‑batched MyBatis attempt that hits the max_allowed_packet limit, and a tuned batch strategy that commits every 1,000 rows—showing how the optimized batch reduces the runtime from hours to just 13 seconds and summarizing best‑practice tips.

Batch Insert · JDBC · Java
0 likes · 13 min read
Kuaishou Tech
Jun 10, 2026 · Mobile Development

How Kuaishou Scaled HarmonyOS: Technical Practices Unveiled at HDC 2026

The article outlines Kuaishou's seven technical sessions at HDC 2026, detailing solutions for HarmonyOS large‑scale deployment such as startup performance, HD streaming, memory‑leak mitigation, cross‑platform framework adaptation, KMP integration, ArkUI optimization, and AI‑native enhancements.

AI integration · ArkUI · Cross‑Platform Development
0 likes · 9 min read
JD Retail Technology
Jun 8, 2026 · Mobile Development

Accelerating Taro Native Static Layout Rendering on HarmonyOS

The article analyzes severe scroll jank on low‑end HarmonyOS devices caused by Taro Native's heavyweight card page, identifies main‑thread overload in layout phases 1, 4 and 5, proposes static node‑tree layout with custom measurement interception and font‑measurement caching, and reports a frame‑rate boost from 43 fps to 57 fps (~32.5% improvement).

CustomNode · HarmonyOS · NODE_LAYOUT_RECT
0 likes · 8 min read
IT Learning Made Simple
Jun 8, 2026 · R&D Management

The Essential Gear to Become a Software Architect

This guide maps the complete skill tree for aspiring software architects, detailing foundational knowledge, core competencies such as system design and performance tuning, extended expertise in cloud‑native and big‑data technologies, and a staged learning roadmap to help newcomers acquire the necessary gear.

Big Data · Cloud Native · Performance Optimization
0 likes · 9 min read
IT Services Circle
Jun 7, 2026 · Fundamentals

Why Switching Linux Page Size to 2 MiB Can Skyrocket Performance

The article explains how the default 4 KiB pages cause frequent TLB misses, how using 2 MiB huge pages expands a single TLB entry’s coverage by 512×, reduces page‑walk depth and page‑table overhead, and provides C++ examples for both hugetlbfs and Transparent Huge Pages.

C++ · Huge Pages · Linux
0 likes · 7 min read
360 Smart Cloud
Jun 2, 2026 · Databases

Valkey 9.1.0 Launches a New Era of AI‑Optimized In‑Memory Storage

Valkey 9.1.0 replaces Redis 7.2 with multi‑threaded networking, redesigned hash tables, and AI‑focused features, delivering up to 230% higher throughput, 20%+ memory savings, open BSD‑3‑Clause governance, and seamless compatibility with existing Redis ecosystems for high‑concurrency and AI workloads.

AI caching · In-Memory Database · KV store
0 likes · 10 min read
Architect Chen
Jun 2, 2026 · Backend Development

Unlock 10× Faster Responses: Inside Nginx’s Caching Mechanism

The article explains how Nginx’s two‑layer caching—browser and proxy—works, why it can reduce backend load and latency, often delivering more than tenfold performance gains for read‑heavy static content, and provides detailed configuration directives such as proxy_cache_path, proxy_cache, proxy_cache_valid, and best‑practice settings to ensure cache validity and avoid cache stampede.

Caching · Configuration · NGINX
0 likes · 5 min read
Baidu Intelligent Cloud Tech Hub
Jun 2, 2026 · Artificial Intelligence

Halving Training Time: LoongForge Full‑Stack Optimizations Boost GR00T N1.6 Throughput 2.3×

LoongForge applies system‑level optimizations—async data prefetch, fine‑grained communication‑compute overlap via a Megatron distributed optimizer, and per‑microbatch CUDA Graph scheduling—to the GR00T N1.6 Vision‑Language‑Action model, delivering up to 2.3× higher training throughput and a 56.6% reduction in overall training time on an 8×A800 cluster.

CUDA Graph · GR00T N1.6 · LoongForge
0 likes · 14 min read
Woodpecker Software Testing
Jun 1, 2026 · Artificial Intelligence

Adversarial Testing Performance Optimization: Practical Strategies for Test Engineers

The article analyzes why adversarial testing is slow—highlighting redundant PGD steps, full model re‑execution, and serial verification—and presents a four‑stage optimization framework (intelligent termination, hierarchical reuse, parallel orchestration, feedback‑driven iteration) that dramatically speeds testing and enables CI/CD integration.

AI robustness · CI/CD · PGD
0 likes · 8 min read
Baidu Geek Talk
Jun 1, 2026 · Cloud Computing

Cut Migration Time by 60%: How Baidu Cloud Scaled Intel Xeon 6 QAT‑Accelerated VM Live Migration

VM live migration in large cloud clusters suffers from high CPU load and long downtime; Baidu Cloud integrated Intel Xeon 6 processors with built‑in QuickAssist Technology to offload memory compression, achieving up to 60% reduction in migration duration, 20% lower CPU usage, and sub‑10 ms pause windows.

CPU offload · Cloud Computing · Intel QAT
0 likes · 10 min read
Baidu Intelligent Cloud Tech Hub
Jun 1, 2026 · Cloud Computing

Cut Migration Time by 60%: Baidu Cloud Deploys Intel Xeon 6 QAT‑Accelerated Live VM Migration

The article analyzes the challenges of large‑scale live VM migration, introduces Intel Xeon 6 CPU‑integrated QAT hardware acceleration, compares pre‑ and post‑QAT workflows, and reports a 60% reduction in migration time, 20% CPU savings, and sub‑10 ms downtime in Baidu Smart Cloud production.

Cloud Computing · Intel QAT · Performance Optimization
0 likes · 10 min read
dbaplus Community
May 26, 2026 · Fundamentals

Can't Master the Linux Kernel Without Understanding NUMA?

This article explains the core principles of NUMA architecture, how it is deeply integrated into Linux kernel memory management, process scheduling, and system calls, and provides practical commands and real‑world examples to diagnose and optimize NUMA‑related performance issues.

Linux kernel · Memory Management · Performance Optimization
0 likes · 24 min read
Architects' Tech Alliance
May 26, 2026 · Information Security

How Sugon Cloud’s “3D Secure Computation” Delivers Seamless Security for Financial Institutions

Facing the 2025‑2026 regulatory deadline, Sichuan Rural Commercial Union Bank migrated its core services to Sugon Cloud’s “3D Secure Computation” platform, achieving full‑link encryption with only a 4.4% performance overhead and proving that hardware‑based security can be both compliant and virtually invisible to users.

Performance Optimization · Sugon Cloud · cloud security
0 likes · 5 min read
Machine Heart
May 21, 2026 · Industry Insights

How ZCube Redefines 20‑Year‑Old Networking Logic to Boost GPU Throughput by 15%

ZCube, a new flat networking architecture deployed by Zhipu in its GLM‑5.1 inference cluster, eliminates structural congestion, delivering a 15% throughput gain, 40.6% latency reduction, and one‑third lower hardware cost without adding GPUs, signaling a shift from raw compute to system efficiency in AI infrastructure.

AI networking · GPU Cluster · MRC protocol
0 likes · 15 min read
IT Services Circle
May 20, 2026 · Databases

Why Can Redis Sustain Over 100k QPS? A Deep Technical Dive

The article explains how Redis achieves more than 100,000 queries per second by leveraging in‑memory storage, highly optimized data structures, a single‑threaded core with epoll‑based I/O multiplexing, optional I/O multithreading, and performance tricks such as pipelining and careful key sizing.

Data Structures · I/O multiplexing · In-Memory Database
0 likes · 9 min read
Spring Full-Stack Practical Cases
May 17, 2026 · Backend Development

How to Accurately Measure Spring Boot Bean Creation Time for Performance Optimization

This article demonstrates a non‑intrusive, high‑precision method to track each Spring Boot bean's creation time by customizing ApplicationContextFactory, Environment, and BeanFactory, allowing package‑level exclusion and clear console reporting to pinpoint slow‑loading beans during startup.

Performance Optimization · Spring Boot · applicationcontextfactory
0 likes · 7 min read
MaGe Linux Operations
May 16, 2026 · Operations

How to Cut Nginx Response Time from 500 ms to 50 ms: A Practical Optimization Guide

By establishing baselines, methodically profiling logs, and applying layered tweaks—such as keepalive connections, gzip compression, proxy caching, worker tuning, HTTP/2, kernel parameters, and backend caching—this guide demonstrates how to reduce Nginx’s total response time from 500 ms to under 50 ms with measurable results.

HTTP/2 · Linux Tuning · NGINX
0 likes · 25 min read
360 Zhihui Cloud Developer
May 15, 2026 · Artificial Intelligence

How PD (Prefill‑Decode) Disaggregation Makes LLM Inference Faster and More Stable

The article explains PD (Prefill‑Decode) disaggregation, an architecture that separates the compute‑bound Prefill stage from the memory‑bound Decode stage onto different GPU pools, eliminating interference, enabling independent scaling, leveraging hardware specialization, and delivering up to 85% lower tail latency for large language model inference.

GPU scaling · KV cache transport · LLM Inference
0 likes · 10 min read
vivo Internet Technology
May 13, 2026 · Big Data

How Vivo Upgraded a Million‑Node YARN Cluster: Architecture, Scheduler Switch, and Performance Optimizations

This article details Vivo's end‑to‑end upgrade of a YARN 2.6.0 cluster to a modern version for a million‑node, hundred‑thousand‑tasks‑per‑day platform, covering architectural evolution, scheduler migration, compatibility fixes, performance tuning, and service‑continuity strategies.

Big Data · Capacity Scheduler · Hadoop
0 likes · 28 min read
Baidu Geek Talk
May 13, 2026 · Artificial Intelligence

LoongForge Boosts Multimodal Training Speed by 45% on GPU and Kunlun XPU

LoongForge, Baidu Baige’s open‑source full‑modal training framework, unifies LLM, VLM and VLA workloads, runs unchanged on NVIDIA GPUs and Kunlun XPU, and delivers 15‑45% end‑to‑end speedups with up to 90% linear scaling on 5,000‑plus card clusters, while simplifying model integration via YAML.

AI Infrastructure · GPU · Kunlun XPU
0 likes · 23 min read
Woodpecker Software Testing
May 12, 2026 · Operations

How AI Cut CI/CD Build Time from 12 Minutes to 98 Seconds in a FinTech Team

A FinTech team's CI pipeline saw build time jump to 12 minutes 37 seconds and test failures rise to 18%; after the team deployed a lightweight AI analysis engine, it identified hidden resource contention caused by a JUnit parameterized test, generated prioritized fixes, and helped cut overall build duration to under two minutes.

AI · CI/CD · Performance Optimization
0 likes · 9 min read
Deepin Linux
May 11, 2026 · Fundamentals

Eliminate Memory Fragmentation: Understanding Memory Pools

The article explains how frequent dynamic allocations cause external and internal memory fragmentation, illustrates the problem with C++ examples, and shows that pre‑allocating a large contiguous block as a memory pool—managed via block division, free‑list tracking, and thread‑safe operations—significantly reduces fragmentation, improves allocation speed, and boosts concurrency performance.

C++ · Memory Fragmentation · Performance Optimization
0 likes · 30 min read
TonyBai
May 11, 2026 · Backend Development

Why Go Builds and Rust Optimizes: The Only Viable Backend Strategy for 2026

The article argues that modern backend systems inevitably hit a scalability wall, and the most effective way to cross it is to use Go for fast, simple service orchestration while delegating performance‑critical, resource‑intensive components to Rust, combining both languages to balance development speed, cost, and reliability.

Cloud Cost Management · Go · Microservices
0 likes · 10 min read
DataFunSummit
May 10, 2026 · Artificial Intelligence

Why Memory Is the Bottleneck for AI Agents and How MemOS Overcomes It

The article analyzes the critical role of memory in AI agents, compares model‑driven and application‑driven approaches, details the five‑layer MemOS architecture with three‑level memory coordination, and presents performance gains such as 100‑200% monthly cloud‑service growth, up to 72% token savings, and a 30% improvement in answer quality.

AI Agent · Enterprise AI · LLM
0 likes · 18 min read
360 Zhihui Cloud Developer
May 9, 2026 · Cloud Computing

Optimizing OpenStack Nova VM Port Attach/Detach Performance

The article analyzes why Nova VM port attach and detach operations can take 70‑90 seconds due to full‑cache refresh and coarse locks, proposes incremental cache updates and fine‑grained port locks, and shows benchmark results that cut attach time to 10‑17 s and detach time to about 7 s while eliminating lock contention and consistency errors.

Attach/Detach · Incremental Cache · Lock Granularity
0 likes · 10 min read
ByteDance SE Lab
May 8, 2026 · Mobile Development

Douyin’s Dynamic Performance Framework: Design, Perception, and Optimization Practices

The article details Douyin's Dynamic Performance Framework (DDPF), covering its evolution from static resource scheduling to a multi‑dimensional signal‑driven system, the perception and decision layers including low‑interaction detection and end‑side intelligence, and concrete VM tuning cases that illustrate how dynamic optimization is achieved on Android.

Android · Douyin · Dynamic Performance
0 likes · 21 min read
Woodpecker Software Testing
May 8, 2026 · Artificial Intelligence

Beyond More Hardware: In‑Depth Strategies to Accelerate AI Safety Testing

The article dissects AI safety testing bottlenecks and presents four optimization dimensions—testing paradigm, data generation, execution architecture, and feedback loop—offering concrete techniques such as risk‑aware input filtering, gradient‑cache reuse, heterogeneous parallelism, and adaptive sampling that together cut testing time by several folds.

AI safety testing · Adaptive Sampling · Performance Optimization
0 likes · 8 min read
Machine Heart
May 7, 2026 · Artificial Intelligence

Nvidia Endorses TokenSpeed: A Light‑Speed Agent Inference Engine Built in Two Months

TokenSpeed, an open‑source LLM inference engine designed for agent workloads, delivers TensorRT‑LLM‑level performance and vLLM‑level ease of use, outperforms TensorRT‑LLM by up to 11% throughput and halves latency on speculative decoding, and has earned Nvidia’s public recommendation.

Agent workloads · LLM Inference · NVIDIA Blackwell
0 likes · 8 min read
iQIYI Technical Product Team
May 7, 2026 · Mobile Development

How iQIYI Cut Memory Peaks by 60% and Boosted Animated Image Loading by 75% with Cangjie on HarmonyOS

iQIYI built a high‑performance image library for HarmonyOS using Huawei's Cangjie language, replacing ArkTS bottlenecks, adding AVIF support and a three‑level cache, and achieved over 60% reduction in memory peak usage and up to 75% faster animated‑image loading, as demonstrated by detailed benchmarks and architectural analysis.

AVIF · Cangjie · HarmonyOS
0 likes · 13 min read
dbaplus Community
May 5, 2026 · Artificial Intelligence

How Claude Transforms SQL Workloads in the Dewu App Data Warehouse

The article examines Claude Code's deep integration into Dewu's e‑commerce data warehouse, outlining a decoupled cognitive‑runtime architecture, standardized I/O contracts, concrete performance gains across tagging, modeling, reporting and testing, and a comprehensive risk‑governance framework.

AI Agentic Workflow · Code LLM · Data Warehouse
0 likes · 23 min read
Old Zhang's AI Learning
May 5, 2026 · Artificial Intelligence

vLLM 0.20.1 Fixes Instability and Speed Issues for DeepSeek V4

The vLLM 0.20.1 patch, released shortly after 0.20.0, consolidates stability fixes and performance optimizations for DeepSeek V4, adds several bug fixes, updates installation instructions, and provides targeted upgrade recommendations for different user scenarios.

Bug Fix · DeepSeek-V4 · GPU inference
0 likes · 9 min read
SpringMeng
May 2, 2026 · Artificial Intelligence

10 Essential AI Prompt Templates Every Programmer Needs

This article presents ten practical AI prompt templates that help programmers efficiently handle requirement clarification, unit test generation, code explanation, refactoring, exception troubleshooting, performance tuning, SQL creation, knowledge documentation, design review, and cross‑language translation, each illustrated with concrete examples and usage tips.

AI prompting · Backend Development · Performance Optimization
0 likes · 13 min read
Architects' Tech Alliance
May 2, 2026 · Artificial Intelligence

Eight Chinese AI Chips Achieve Day‑Zero DeepSeek‑V4 Compatibility

The article explains how eight domestic AI chip makers—Huawei Ascend, Cambricon, HaiGuang, Moore Threads, Kunlun, Pingtouge, Muxi, and Tianshu—simultaneously completed full‑link compatibility, performance tuning, and stability verification for DeepSeek‑V4 on release day, detailing each vendor’s technical path, shared ecosystem breakthroughs, and the broader impact on the AI industry.

AI chips · Day0 adaptation · DeepSeek-V4
0 likes · 11 min read
Wukong Talks Architecture
May 1, 2026 · Databases

How We Monitored and Optimized Databases During a New‑Old System Switch (Part 1)

During a high‑traffic migration where QPS peaked over 10,000, the team used DBDoctor to perform full‑stack database monitoring, pinpoint long‑running transactions and slow SQL, apply index recommendations, and achieve cost reductions of up to 246 000 times, demonstrating rapid, data‑driven performance optimization.

DBDoctor · Database Monitoring · Index Recommendation
0 likes · 9 min read
IT Services Circle
May 1, 2026 · Artificial Intelligence

10 Essential AI Prompt Templates Every Programmer Should Use

The article presents ten practical AI prompt templates that cover the full software development workflow—from requirement clarification and code generation to testing, refactoring, debugging, performance tuning, SQL optimization, documentation, design review, and cross‑language translation—helping developers get accurate, production‑ready results from AI.

AI prompting · Java · Performance Optimization
0 likes · 12 min read
Woodpecker Software Testing
Apr 29, 2026 · Artificial Intelligence

Adversarial Testing Performance Optimization: A Practical Guide for Test Experts

As AI deployments accelerate, the article explains why adversarial testing is inherently slow, identifies three coupling bottlenecks, and presents a four‑stage, data‑driven optimization framework that boosts throughput by up to 3.2× while preserving robustness, backed by real‑world financial‑AI case studies.

AI robustness · Performance Optimization · adversarial cache
0 likes · 7 min read
James' Growth Diary
Apr 26, 2026 · Backend Development

How Claude Code Achieves Sub‑Second Cold Starts with Lazy Loading and Compile‑Time Feature Gating

The article dissects Claude Code's sub‑second cold‑start performance by detailing its lazy‑loading mechanism, compile‑time feature‑gate (DCE) via bun:bundle, runtime gating with GrowthBook, and the engineering trade‑offs of managing over 88 feature flags in a single‑file CLI bundle.

Bun · CLI · Performance Optimization
0 likes · 16 min read
dbaplus Community
Apr 26, 2026 · Operations

Why the Lsof Command Is an Underrated Lifesaver in Production

The article explains how the Linux lsof utility can quickly identify port conflicts, lingering deleted files, and file‑handle leaks, offering practical commands, real‑world case studies, advanced options, performance tips, and integration techniques for effective system troubleshooting.

Linux · Performance Optimization · file handles
0 likes · 12 min read
Shi's AI Notes
Apr 24, 2026 · Backend Development

How OpenAI’s Responses API WebSocket Revamp Accelerates Agent Workflows by 40%

OpenAI identified API‑overhead as the new bottleneck after faster model inference and introduced a persistent WebSocket connection that caches conversation state, overlaps request phases, and preserves the original API shape, delivering up to a 40% end‑to‑end latency reduction and dramatically higher TPS.

OpenAI · Performance Optimization · Responses API
0 likes · 11 min read
Machine Heart
Apr 24, 2026 · Artificial Intelligence

Cambricon Achieves Day‑0 Native Support for DeepSeek‑V4, Uniting Two Chinese AI Leaders

Cambricon leveraged its NeuWare stack and vLLM framework to deliver Day‑0 native support for DeepSeek‑V4‑flash (285 B) and DeepSeek‑V4‑pro (1.6 T), open‑sourcing the adaptation and showcasing rapid model migration alongside extreme performance optimizations across software and hardware layers.

AI inference · Cambricon · DeepSeek-V4
0 likes · 5 min read
Baidu Intelligent Cloud Tech Hub
Apr 24, 2026 · Artificial Intelligence

LoongForge: Open‑Source Multimodal Training Framework Runs on GPU and Kunlun XPU with 45% Speedup

LoongForge is an open‑source, Megatron‑based multimodal training framework that unifies LLM, VLM, VLA and diffusion models, runs seamlessly on NVIDIA GPUs and Baidu Kunlun XPU, and delivers 15%‑45% end‑to‑end training acceleration while scaling linearly on thousands of cards.

GPU · Kunlun XPU · LoongForge
0 likes · 23 min read
Java Architect Handbook
Apr 22, 2026 · Backend Development

How Changing Five Lines of Code Boosted API Throughput Over 10×

A low‑traffic B2B service struggled to meet a 500 req/s demand, achieving only 50 req/s with high CPU usage; through systematic profiling, lock analysis, async refactoring, thread‑pool tuning, and eliminating costly Spring bean creation, the team dramatically improved response times and throughput, revealing deeper CPU‑usage mysteries.

Java · Performance Optimization · Profiling
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Apr 20, 2026 · Cloud Computing

How Alibaba Cloud’s Agentic Search Redefines Enterprise AI Search

The article analyzes Alibaba Cloud Elasticsearch’s shift from keyword‑based to Agent‑native search, detailing the Agent Native architecture, hybrid retrieval 2.0, FalconSeek engine performance gains of up to 300%, cost reductions of 40‑70%, and the ecosystem of ES Skills, cloud‑native enhancements, and observability that together enable a scalable AI search platform for enterprises.

AI Search · Agentic Architecture · Cloud Computing
0 likes · 13 min read
Coder Trainee
Apr 19, 2026 · Backend Development

How to Optimize Performance and Deploy a Production‑Ready Blog System

This article walks through a complete performance‑optimization and deployment pipeline for a Spring Boot blog, covering multi‑level caching with Caffeine and Redis, database indexing and cursor pagination, read‑write splitting, asynchronous processing, rate limiting, Docker multi‑stage builds, Nginx reverse‑proxy setup, Actuator monitoring, custom metrics, health checks, alerting, JMeter load testing, and JVM tuning.

Caffeine · Docker · Performance Optimization
0 likes · 17 min read
Woodpecker Software Testing
Apr 18, 2026 · Operations

Deep Dive into Performance Optimization for Self‑Healing Test Scripts

The article examines why self‑healing test scripts increase runtime overhead, breaks down the underlying mechanisms, and presents four concrete optimization tactics—layered healing, locator caching, visual/semantic throttling, and asynchronous repair—backed by real‑world case data showing up to 43% faster regressions and 52% lower maintenance cost.

CI/CD · Performance Optimization · UI testing
0 likes · 8 min read
Deepin Linux
Apr 18, 2026 · Fundamentals

Mastering Process Context Switching: What the CPU Actually Does

This article breaks down the fundamentals of process context switching, explaining CPU registers, program counters, the three-step switch routine, trigger conditions, performance impact, monitoring tools, and practical optimization techniques to help interview candidates answer confidently.

Linux · Performance Optimization · Process Scheduling
0 likes · 29 min read
ByteDance SE Lab
Apr 17, 2026 · Industry Insights

How DisCoGC Cuts Storage Costs by 20%: A Deep Dive into ByteStore’s New GC Paradigm

This article analyzes the DisCoGC algorithm introduced by ByteDance, explaining how its discard‑centric garbage collection eliminates the write‑amplification vs. space‑amplification trade‑off in log‑structured storage, details the engineering challenges of multi‑layer deployment, and presents production results showing up to 20% TCO reduction without impacting latency.

Compaction · Distributed storage · Garbage Collection
0 likes · 19 min read
JD Tech
Apr 16, 2026 · Industry Insights

How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture

This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.

Apache Doris · Batch Processing · Coupon Search
0 likes · 12 min read
Ctrip Technology
Apr 16, 2026 · Big Data

How Ray + DuckDB Cut 900M-Row Attribution Queries from 40s to 15s

When attribution analysis on over 900 million rows slowed to more than 40 seconds and threatened cluster stability, Ctrip's smart attribution team rebuilt the architecture with Ray and DuckDB, achieving sub‑15‑second query times, a 160% performance gain, and complete resource isolation.

Attribution Analysis · Big Data · Distributed Computing
0 likes · 22 min read
DataFunTalk
Apr 16, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's data platform evolution from a simple ClickHouse‑based ad‑hoc system to a Lambda‑style architecture and finally a lakehouse solution, highlighting how the adoption of a new incremental computing model reduced architectural complexity, resource consumption and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse
0 likes · 21 min read
Architect Chen
Apr 16, 2026 · Big Data

Supercharge Kafka Consumer Performance: Parallelism, Batching, and Multithreading

This guide explains practical techniques to dramatically increase Kafka consumer throughput, including scaling consumer instances or partitions, tuning fetch and poll parameters, and implementing a multithreaded consumer model, while also covering hardware, JVM, and OS optimizations and monitoring recommendations.

Batch Fetch · Consumer Parallelism · Monitoring
0 likes · 5 min read
Qborfy AI
Apr 16, 2026 · Artificial Intelligence

How Trace Analysis Turns AI Agents from Black Boxes into Optimized Systems

Trace analysis converts the opaque decision‑making of AI agents into observable data, enabling systematic collection, parallel error detection, targeted improvements, and iterative experimentation, while revealing common failure patterns, budgeting trade‑offs, over‑fitting risks, and cost‑optimization opportunities through a reusable Trace Analyzer Skill framework.

AI · LLM · Observability
0 likes · 33 min read
Woodpecker Software Testing
Apr 15, 2026 · Artificial Intelligence

How AI Testing Tools Redefine Performance Optimization: A New Paradigm

Amid exploding large‑model deployments, AI teams struggle with slow test feedback, but AI‑native testing tools—through intelligent load modeling, inference‑layer root‑cause analysis, and self‑healing loops—demonstrate concrete latency reductions, resource savings, and faster issue remediation.

AI testing · MLOps · Observability
0 likes · 6 min read
Java Web Project
Apr 15, 2026 · Backend Development

How We Cut Spring Boot Startup from 12 s to 3 s with GraalVM Native Image

This article walks through converting a Spring Boot order‑query microservice to a GraalVM Native Image, detailing environment setup, common build pitfalls with concrete code fixes, Docker multi‑stage packaging, K8s scaling comparison, performance benchmarks, CI/CD integration, and guidance on when Native Image is appropriate.

CI/CD · Docker · GraalVM
0 likes · 12 min read
Tencent Technical Engineering
Apr 12, 2026 · Operations

How TencentOS Engineers Revamped Linux Swap for 5‑20% Performance Gains

This article translates and consolidates three LWN analyses of the Linux swap subsystem modernization led by TencentOS kernel engineer Kairui Song, detailing the introduction of swap tables, removal of the swap map, virtual swap concepts, code changes, performance improvements of up to 20%, and the broader impact on the kernel community.

Linux kernel · Memory Management · Performance Optimization
0 likes · 27 min read
Deepin Linux
Apr 12, 2026 · Fundamentals

Why TLB Matters: Unlocking Linux Kernel Performance

This article explains the role of the Translation Lookaside Buffer (TLB) in Linux virtual‑memory translation, covering basic address concepts, page‑table mechanics, TLB operation, flush and synchronization strategies, hardware vs software management, Linux kernel APIs, and a practical C benchmark comparing sequential and random memory accesses.

Cache · Operating Systems · Performance Optimization
0 likes · 36 min read
Old Zhang's AI Learning
Apr 11, 2026 · Artificial Intelligence

Mastering SGLang: KV Cache and RadixAttention for Faster LLM Inference

This article reviews the DeepLearning.ai short course on SGLang, explains why large‑language‑model inference is slow, details how KV Cache reduces the computation from O(n²) to O(n), introduces RadixAttention for cross‑request caching, and presents code examples and benchmark results showing up to 10× speedup in real‑world RAG scenarios.

KV cache · LLM Inference · Performance Optimization
0 likes · 13 min read
ITPUB
Apr 10, 2026 · Backend Development

How a Simple Refactor and Parallelism Cut Java Loop Time from 26s to 0.7s

A new team member transformed a painfully slow Java data‑processing routine—originally taking 26,856 ms—by refactoring nested loops, extracting repeated calculations, and introducing a thread‑pool for parallel execution, reducing runtime to just 748 ms, and the article walks through the before‑and‑after code and key techniques.

Java · Performance Optimization · parallel computing
0 likes · 8 min read
Woodpecker Software Testing
Apr 10, 2026 · Operations

How Adversarial Testing Drives Hidden Performance Gains

Adversarial testing transforms performance optimization by injecting extreme, realistic failures—such as cache avalanches, CDN outages, or slow SQL—to expose fragile boundaries, tighten observability, and create a rapid, evidence‑driven feedback loop that prevents costly production incidents.

Microservices · Observability · Performance Optimization
0 likes · 8 min read
DataFunTalk
Apr 10, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article analyzes Xiaohongshu's data platform evolution—from a simple ClickHouse‑based analytics layer to a Lambda architecture and finally a lakehouse design—highlighting how adopting a new incremental computing model reduced architecture complexity, resource consumption, and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse
0 likes · 22 min read
Black & White Path
Apr 8, 2026 · Artificial Intelligence

Run Massive AI Models on a Single PC: The 1‑Bit LLM Revolution

Microsoft’s open‑source bitnet.cpp transforms 100‑billion‑parameter LLM inference from GPU‑only to ordinary CPUs by replacing floating‑point matrix multiplication with integer add‑subtract, cutting energy use by 82%, memory by 90%, and delivering up to 6× speed on x86/ARM hardware.

1-bit LLM · BitNet · CPU inference
0 likes · 7 min read
JavaGuide
Apr 7, 2026 · Information Security

Why Brute‑Force Won’t Cut It for Sensitive‑Word Filtering (And What Actually Works)

The article walks through the evolution of sensitive‑word filtering—from naïve brute‑force scanning to Trie, Aho‑Corasick automaton, Double‑Array Trie, and DFA implementations—detailing their algorithms, time/space complexities, concrete Java code examples, performance trade‑offs, high‑concurrency optimizations, and practical production advice for building a robust content‑moderation system.

Aho-Corasick · DFA · Java
0 likes · 26 min read
James' Growth Diary
Apr 6, 2026 · Artificial Intelligence

10 Practical LangChain Performance Hacks to Speed Up and Cut Costs

This article presents ten concrete techniques—including in‑memory and Redis caching, semantic caching, parallel execution, batch processing, prompt compression, model routing, streaming output, and connection‑pool reuse—to dramatically reduce latency and token costs in production LangChain applications.

Caching · LangChain · Node.js
0 likes · 14 min read
ITPUB
Apr 2, 2026 · Operations

Why Your SSD Slows Down Over Time and How to Fix It on Linux

This guide explains the reasons behind SSD performance degradation, such as write‑amplification and garbage collection, and provides practical Linux techniques—including enabling TRIM, maintaining free space, reducing unnecessary writes, and using smartctl—to restore and preserve SSD speed.

Linux · Performance Optimization · SSD
0 likes · 6 min read
Tencent Architect
Apr 2, 2026 · Operations

How Modernizing Linux Swap Boosts Performance and Cuts Memory Overhead

This article translates and consolidates Jonathan Corbet’s three-part “Modernizing swapping” series, detailing the introduction of swap tables, removal of swap maps, and virtual swap concepts that together improve Linux kernel swap performance by up to 20%, reduce metadata memory by up to 30%, and simplify the codebase.

Linux kernel · Performance Optimization · swap map
0 likes · 27 min read
TonyBai
Apr 1, 2026 · Backend Development

How a $400 AI‑Driven Rewrite of JSONata Saved $500K in Kubernetes Costs

Using AI agents, an engineer rewrote the JavaScript‑based JSONata engine in Go within a day for $400 in token fees, cutting a Kubernetes‑hosted service’s annual cost from $500,000 to zero and delivering up to 1,500× performance gains, while outlining the step‑by‑step AI‑driven refactoring process.

AI-driven refactoring · JSONata · Microservices
0 likes · 10 min read
AI Architecture Path
Apr 1, 2026 · Frontend Development

How Pretext Eliminates DOM Reflows for Ultra‑Fast Text Measurement

Pretext, a zero‑DOM, high‑performance text measurement engine created by React core contributor chenglou, uses Canvas‑based calculations and a two‑stage prepare/layout workflow to avoid layout reflows, delivering up to 500× speed gains for virtual scrolling, rich‑text rendering, and AI‑driven UI layout predictions.

Performance Optimization · Pretext · text measurement
0 likes · 7 min read
Ubuntu
Mar 31, 2026 · Operations

Master Systemd Service Management: From Basics to Advanced Linux Skills

This comprehensive guide walks you through Systemd fundamentals, core systemctl commands, unit file anatomy, custom service creation, common troubleshooting, performance tuning, timer and socket activation, and best‑practice security hardening for Linux administrators.

Linux · Performance Optimization · service management
0 likes · 18 min read
Deepin Linux
Mar 28, 2026 · Fundamentals

Unlocking Linux Performance: A Deep Dive into NUMA Architecture

This article explains the core principles of NUMA, its deep integration with the Linux kernel, practical memory‑node and scheduling mechanisms, real‑world database and virtualization use cases, and step‑by‑step commands for inspecting and tuning NUMA on modern servers.

Linux kernel · Memory Management · Performance Optimization
0 likes · 23 min read
vivo Internet Technology
Mar 25, 2026 · Industry Insights

How Vivo Scaled Marketing Automation with Presto, Bitmap, and StarRocks

This case study details how Vivo’s marketing automation platform evolved its data‑driven architecture—from a Presto‑based wide‑table design, through a Bitmap optimization, to a StarRocks migration—addressing performance bottlenecks, reducing resource costs, and enhancing data security.

Big Data · Data Architecture · OLAP
0 likes · 11 min read
Top Architect
Mar 25, 2026 · Backend Development

Boost API Performance 10× with a Three‑Tier Cache Pyramid in Spring Boot 3

This article explains how to design and implement a three‑level cache pyramid (Caffeine → Redis → MySQL) in Spring Boot 3, covering configuration, a reusable CacheTemplate, hot‑key handling, random TTL, warm‑up, monitoring, and load‑test results that show latency dropping from tens of milliseconds to a few milliseconds while cutting CPU and network usage dramatically.

Backend Development · Caching · Caffeine
0 likes · 11 min read
Alibaba Cloud Developer
Mar 25, 2026 · Databases

How AliSQL AI Diagnoses and Eliminates MySQL Replication Lag

This article analyzes the severe replication‑delay issues in MySQL master‑slave setups, identifies four typical workload patterns that cause lag, demonstrates how AliSQL's AI assistant pinpoints the root causes, and explains the kernel‑level optimizations that completely remove the delay.

AI Diagnosis · AliSQL · Performance Optimization
0 likes · 13 min read
AI Explorer
Mar 23, 2026 · Artificial Intelligence

How Unsloth Studio Turns Local AI Training into a Simple, High‑Performance Experience

Unsloth Studio, an open‑source local AI studio, combines a sleek web UI with a custom Triton kernel that claims up to 2× faster training, 70% VRAM savings (80% for RL), supports over 500 models, visual data‑recipe workflows, and both desktop and Python library usage for developers, researchers, and hobbyists.

AI Studio · Model Training · Performance Optimization
0 likes · 7 min read
Baidu Geek Talk
Mar 23, 2026 · Databases

How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture

This article analyzes the challenges of scaling ClickHouse within Baidu’s MEG data platform and details a lake‑house solution that decouples storage and compute, integrates a meta‑service for transparent data access, optimizes query performance through caching, data roll‑up and layout tuning, and introduces a unified query gateway that gracefully falls back to Spark for complex workloads.

ClickHouse · Data Platform · Lakehouse
0 likes · 25 min read
Architect's Guide
Mar 20, 2026 · Backend Development

How We Cut 1‑Second Query Times in a Legacy WAF Dashboard Using Redis Caching

Facing slow page loads in a legacy WAF reporting system, we dissected a 1000‑line Java method, introduced hourly aggregation, Redis auto‑increment counters, and scheduled synchronization, eliminating costly SQL scans and achieving sub‑second queries on 1.5 million logs, while outlining remaining optimization opportunities.

Data Archiving · Java · Performance Optimization
0 likes · 12 min read
JD Tech Talk
Mar 17, 2026 · Backend Development

How to Build a MyBatis Plugin that Shields Databases from Sudden Traffic Spikes

This article explains the challenges of sudden traffic bursts on applications and databases, outlines a MyBatis plugin design that intercepts SQL, uses fingerprint‑based throttling with configurable policies, and details the development, optimization, testing, and documentation steps performed with pair‑programming assistance.

MyBatis · Performance Optimization · SQL interceptor
0 likes · 9 min read
LuTiao Programming
Mar 14, 2026 · Backend Development

Why Your Spring Boot App Freezes at One Million Records – 5 Proven Techniques to Double Performance

When a Spring Boot application reaches millions of rows, it often suffers from OutOfMemoryErrors, slow queries, and high CPU, but by applying five proven strategies—pagination, streaming, batch processing, indexing, and asynchronous execution—you can halve memory usage and achieve up to ten‑fold speed gains.

Asynchronous Execution · Batch Processing · Indexing
0 likes · 11 min read
Architecture & Thinking
Mar 13, 2026 · Databases

Why MySQL Deep Pagination Slows Down Your E‑commerce Site and How to Fix It

The article explains how deep pagination on massive MySQL tables causes full‑table scans, massive I/O, and memory pressure, then presents six concrete optimization techniques—including delayed join, cursor pagination, covering indexes, ID‑range pagination, caching, and partitioning—backed by a real‑world e‑commerce case study and detailed execution‑plan analysis.

Indexing · Performance Optimization · SQL
0 likes · 18 min read
Woodpecker Software Testing
Mar 10, 2026 · Operations

Uncovering Test Data Generation Bottlenecks and Proven Ways to Accelerate CI Pipelines

The article examines why traditional manual or full‑backup test data creation becomes a performance bottleneck in modern micro‑service, TB‑scale environments, identifies three structural imbalances—data‑dependency, generation‑logic, and semantic redundancy—and presents a three‑layered optimization framework plus engineering best‑practices that can cut data‑prep time by up to 68%.

Automation · CI/CD · Microservices
0 likes · 8 min read
Code Wrench
Mar 8, 2026 · Artificial Intelligence

How to Build Low‑Latency AI‑Powered Video Calls with Go and WebRTC

This article breaks down the latency challenges of combining AI with WebRTC, compares edge and cloud processing architectures, and provides a detailed Go‑based implementation—including RTP interception, AI model integration, real‑time translation pipelines, and performance optimizations—for ultra‑responsive video conferencing.

AI · Go · Performance Optimization
0 likes · 7 min read
Machine Learning Algorithms & Natural Language Processing
Mar 5, 2026 · Artificial Intelligence

Mamba’s SSD Framework Shatters Serial Bottleneck, Outperforms vLLM and SGLang

The new Speculative Speculative Decoding (SSD) framework, built by the Mamba and FlashAttention authors, eliminates the serial draft‑verification bottleneck in LLM inference by running the draft model asynchronously, introducing a speculation cache and the Saguaro algorithm, which together deliver up to 5× speedup over autoregressive baselines and up to 2× over optimized engines on Llama‑3 and Qwen‑3, reshaping the latency‑throughput trade‑off.

Asynchronous Parallelism · LLM Inference · Performance Optimization
0 likes · 9 min read