Tagged articles

1919 articles

May 20, 2026 · Databases

Why Can Redis Sustain Over 100k QPS? A Deep Technical Dive

The article explains how Redis achieves more than 100,000 queries per second by leveraging in‑memory storage, highly optimized data structures, a single‑threaded core with epoll‑based I/O multiplexing, optional I/O multithreading, and performance tricks such as pipelining and careful key sizing.

Data Structures · I/O Multiplexing · In-Memory Database

0 likes · 9 min read

Xiaohongshu Tech REDtech

May 18, 2026 · Artificial Intelligence

CCD‑Aware Thread Orchestration Shatters Multi‑Core CPU Vector Search Performance Ceiling

The paper presents a CCD‑level load‑aware thread orchestration framework that boosts vector ANNS throughput up to 3.7×, cuts P999 tail latency by 30%‑90%, reduces L3 cache miss rates by 6%‑30% and CPU stall time by 20%‑80% on AMD EPYC multi‑chiplet CPUs.

ANNS · CCD · CPU cache

0 likes · 19 min read

Spring Full-Stack Practical Cases

May 17, 2026 · Backend Development

How to Accurately Measure Spring Boot Bean Creation Time for Performance Optimization

This article demonstrates a non‑intrusive, high‑precision method to track each Spring Boot bean's creation time by customizing ApplicationContextFactory, Environment, and BeanFactory, allowing package‑level exclusion and clear console reporting to pinpoint slow‑loading beans during startup.

Performance Optimization · Spring Boot · applicationcontextfactory

0 likes · 7 min read

MaGe Linux Operations

May 16, 2026 · Operations

How to Cut Nginx Response Time from 500 ms to 50 ms: A Practical Optimization Guide

By establishing baselines, methodically profiling logs, and applying layered tweaks—such as keepalive connections, gzip compression, proxy caching, worker tuning, HTTP/2, kernel parameters, and backend caching—this guide demonstrates how to reduce Nginx’s total response time from 500 ms to under 50 ms with measurable results.

Gzip · HTTP/2 · Keepalive

0 likes · 25 min read

Woodpecker Software Testing

May 12, 2026 · Operations

How AI Cut CI/CD Build Time from 12 Minutes to 98 Seconds in a FinTech Team

A FinTech team's CI pipeline saw build time jump to 12 minutes 37 seconds and test failures rise to 18%. After the team deployed a lightweight AI analysis engine, it identified hidden resource contention caused by a JUnit parameterized test, generated prioritized fixes, and reduced overall build duration to under two minutes.

AI · DevOps · Performance Optimization

0 likes · 9 min read

Deepin Linux

May 11, 2026 · Fundamentals

Eliminate Memory Fragmentation: Understanding Memory Pools

The article explains how frequent dynamic allocations cause external and internal memory fragmentation, illustrates the problem with C++ examples, and shows that pre‑allocating a large contiguous block as a memory pool—managed via block division, free‑list tracking, and thread‑safe operations—significantly reduces fragmentation, improves allocation speed, and boosts concurrency performance.

C · Memory Fragmentation · Performance Optimization

0 likes · 30 min read

DataFunSummit

May 10, 2026 · Artificial Intelligence

Why Memory Is the Bottleneck for AI Agents and How MemOS Overcomes It

The article analyzes the critical role of memory in AI agents, compares model‑driven and application‑driven approaches, details the five‑layer MemOS architecture with three‑level memory coordination, and presents performance gains such as 100‑200% monthly cloud‑service growth, up to 72% token savings, and a 30% improvement in answer quality.

AI Agent · Enterprise AI · LLM

0 likes · 18 min read

Woodpecker Software Testing

May 8, 2026 · Artificial Intelligence

Beyond More Hardware: In‑Depth Strategies to Accelerate AI Safety Testing

The article dissects AI safety testing bottlenecks and presents four optimization dimensions (testing paradigm, data generation, execution architecture, and feedback loop), offering concrete techniques such as risk‑aware input filtering, gradient‑cache reuse, heterogeneous parallelism, and adaptive sampling that together cut testing time severalfold.

AI safety testing · Performance Optimization · adaptive sampling

0 likes · 8 min read

Machine Heart

May 7, 2026 · Artificial Intelligence

Nvidia Endorses TokenSpeed: A Light‑Speed Agent Inference Engine Built in Two Months

TokenSpeed, an open‑source LLM inference engine designed for agent workloads, delivers TensorRT‑LLM‑level performance and vLLM‑level ease of use, outperforms TensorRT‑LLM by up to 11% throughput and halves latency on speculative decoding, and has earned Nvidia’s public recommendation.

Agent workloads · LLM inference · NVIDIA Blackwell

0 likes · 8 min read

iQIYI Technical Product Team

May 7, 2026 · Mobile Development

How iQIYI Cut Memory Peaks by 60% and Boosted Animated Image Loading by 75% with Cangjie on HarmonyOS

iQIYI built a high‑performance image library for HarmonyOS using Huawei's Cangjie language, replacing ArkTS bottlenecks, adding AVIF support and a three‑level cache, and achieved over 60% reduction in memory peak usage and up to 75% faster animated‑image loading, as demonstrated by detailed benchmarks and architectural analysis.

AVIF · Cangjie · HarmonyOS

0 likes · 13 min read

dbaplus Community

May 5, 2026 · Artificial Intelligence

How Claude Transforms SQL Workloads in the Dewu App Data Warehouse

The article examines Claude Code's deep integration into Dewu's e‑commerce data warehouse, outlining a decoupled cognitive‑runtime architecture, standardized I/O contracts, concrete performance gains across tagging, modeling, reporting and testing, and a comprehensive risk‑governance framework.

AI Agentic Workflow · Code LLM · Data Warehouse

0 likes · 23 min read

Old Zhang's AI Learning

May 5, 2026 · Artificial Intelligence

vLLM 0.20.1 Fixes Instability and Speed Issues for DeepSeek V4

The vLLM 0.20.1 patch, released shortly after 0.20.0, consolidates stability fixes and performance optimizations for DeepSeek V4, adds several bug fixes, updates installation instructions, and provides targeted upgrade recommendations for different user scenarios.

DeepSeek-V4 · GPU inference · Model Deployment

0 likes · 9 min read

SpringMeng

May 2, 2026 · Artificial Intelligence

10 Essential AI Prompt Templates Every Programmer Needs

This article presents ten practical AI prompt templates that help programmers efficiently handle requirement clarification, unit test generation, code explanation, refactoring, exception troubleshooting, performance tuning, SQL creation, knowledge documentation, design review, and cross‑language translation, each illustrated with concrete examples and usage tips.

AI prompting · Backend Development · Code review

0 likes · 13 min read

Architects' Tech Alliance

May 2, 2026 · Artificial Intelligence

Eight Chinese AI Chips Achieve Day‑Zero DeepSeek‑V4 Compatibility

The article explains how eight domestic AI chip makers—Huawei Ascend, Cambricon, HaiGuang, Moore Threads, Kunlun, Pingtouge, Muxi, and Tianshu—simultaneously completed full‑link compatibility, performance tuning, and stability verification for DeepSeek‑V4 on release day, detailing each vendor’s technical path, shared ecosystem breakthroughs, and the broader impact on the AI industry.

AI chips · Day0 adaptation · DeepSeek-V4

0 likes · 11 min read

IT Services Circle

May 1, 2026 · Artificial Intelligence

10 Essential AI Prompt Templates Every Programmer Should Use

The article presents ten practical AI prompt templates that cover the full software development workflow—from requirement clarification and code generation to testing, refactoring, debugging, performance tuning, SQL optimization, documentation, design review, and cross‑language translation—helping developers get accurate, production‑ready results from AI.

AI prompting · Code Generation · Debugging

0 likes · 12 min read

IT Services Circle

Apr 30, 2026 · Fundamentals

Why the 1990s Windows Task Manager Fit in 80KB and Lessons for Modern Developers

In the 1990s, Windows Task Manager shipped at only about 80 KB, a feat driven by extreme hardware limits and meticulous engineering; Dave Plummer explains the design choices that kept it responsive and how those minimalist principles still guide efficient software development today.

Performance Optimization · Resource Constraints · Software Engineering

0 likes · 12 min read

Woodpecker Software Testing

Apr 29, 2026 · Artificial Intelligence

Adversarial Testing Performance Optimization: A Practical Guide for Test Experts

As AI deployments accelerate, the article explains why adversarial testing is inherently slow, identifies three coupling bottlenecks, and presents a four‑stage, data‑driven optimization framework that boosts throughput by up to 3.2× while preserving robustness, backed by real‑world financial‑AI case studies.

AI Robustness · Performance Optimization · adversarial cache

0 likes · 7 min read

Java Tech Enthusiast

Apr 28, 2026 · Backend Development

How a Single Front‑End Change Dragged Four Backend Teams – The BFF Solution

A tiny front‑end tweak that required five microservice calls turned the front‑end into a glue layer, prompting a meeting with four backend teams, and the author explains how adopting a Backend‑for‑Frontend (BFF) pattern resolves such integration pain points with concrete examples and code.

API Aggregation · BFF · Backend For Frontend

0 likes · 22 min read

James' Growth Diary

Apr 26, 2026 · Backend Development

How Claude Code Achieves Sub‑Second Cold Starts with Lazy Loading and Compile‑Time Feature Gating

The article dissects Claude Code's sub‑second cold‑start performance by detailing its lazy‑loading mechanism, compile‑time feature‑gate (DCE) via bun:bundle, runtime gating with GrowthBook, and the engineering trade‑offs of managing over 88 feature flags in a single‑file CLI bundle.

Bun · CLI · Performance Optimization

0 likes · 16 min read

dbaplus Community

Apr 26, 2026 · Operations

Why the Lsof Command Is an Underrated Lifesaver in Production

The article explains how the Linux lsof utility can quickly identify port conflicts, lingering deleted files, and file‑handle leaks, offering practical commands, real‑world case studies, advanced options, performance tips, and integration techniques for effective system troubleshooting.

Linux · Performance Optimization · file handles

0 likes · 12 min read

Shi's AI Notes

Apr 24, 2026 · Backend Development

How OpenAI’s Responses API WebSocket Revamp Accelerates Agent Workflows by 40%

OpenAI identified API‑overhead as the new bottleneck after faster model inference and introduced a persistent WebSocket connection that caches conversation state, overlaps request phases, and preserves the original API shape, delivering up to a 40% end‑to‑end latency reduction and dramatically higher TPS.

Backend · OpenAI · Performance Optimization

0 likes · 11 min read

Machine Heart

Apr 24, 2026 · Artificial Intelligence

Cambricon Achieves Day‑0 Native Support for DeepSeek‑V4, Uniting Two Chinese AI Leaders

Cambricon leveraged its NeuWare stack and vLLM framework to deliver Day‑0 native support for DeepSeek‑V4‑flash (285 B) and DeepSeek‑V4‑pro (1.6 T), open‑sourcing the adaptation and showcasing rapid model migration alongside extreme performance optimizations across software and hardware layers.

AI inference · Cambricon · DeepSeek-V4

0 likes · 5 min read

Baidu Intelligent Cloud Tech Hub

Apr 24, 2026 · Artificial Intelligence

LoongForge: Open‑Source Multimodal Training Framework Runs on GPU and Kunlun XPU with 45% Speedup

LoongForge is an open‑source, Megatron‑based multimodal training framework that unifies LLM, VLM, VLA and diffusion models, runs seamlessly on NVIDIA GPUs and Baidu Kunlun XPU, and delivers 15%‑45% end‑to‑end training acceleration while scaling linearly on thousands of cards.

GPU · Kunlun XPU · LoongForge

0 likes · 23 min read

Java Architect Handbook

Apr 22, 2026 · Backend Development

How Changing Five Lines of Code Boosted API Throughput Over 10×

A low‑traffic B2B service struggled to meet a 500 req/s demand, achieving only 50 req/s with high CPU usage; through systematic profiling, lock analysis, async refactoring, thread‑pool tuning, and eliminating costly Spring bean creation, the team dramatically improved response times and throughput, revealing deeper CPU‑usage mysteries.

Java · Performance Optimization · Profiling

0 likes · 16 min read

Alibaba Cloud Big Data AI Platform

Apr 20, 2026 · Cloud Computing

How Alibaba Cloud’s Agentic Search Redefines Enterprise AI Search

The article analyzes Alibaba Cloud Elasticsearch’s shift from keyword‑based to Agent‑native search, detailing the Agent Native architecture, hybrid retrieval 2.0, FalconSeek engine performance gains of up to 300%, cost reductions of 40‑70%, and the ecosystem of ES Skills, cloud‑native enhancements, and observability that together enable a scalable AI search platform for enterprises.

AI search · Agentic Architecture · Cost reduction

0 likes · 13 min read

Architect Chen

Apr 20, 2026 · Backend Development

Mastering Nginx Static‑Dynamic Separation: Principles, Architecture & Config

This article explains how Nginx static‑dynamic separation works, why it boosts performance, the core design principles, typical deployment architectures, and provides a complete configuration example with caching and rate‑limiting to dramatically reduce backend load.

NGINX · Performance Optimization · caching

0 likes · 5 min read

Coder Trainee

Apr 19, 2026 · Backend Development

How to Optimize Performance and Deploy a Production‑Ready Blog System

This article walks through a complete performance‑optimization and deployment pipeline for a Spring Boot blog, covering multi‑level caching with Caffeine and Redis, database indexing and cursor pagination, read‑write splitting, asynchronous processing, rate limiting, Docker multi‑stage builds, Nginx reverse‑proxy setup, Actuator monitoring, custom metrics, health checks, alerting, JMeter load testing, and JVM tuning.

Caffeine · Docker · Performance Optimization

0 likes · 17 min read

Woodpecker Software Testing

Apr 18, 2026 · Operations

Deep Dive into Performance Optimization for Self‑Healing Test Scripts

The article examines why self‑healing test scripts increase runtime overhead, breaks down the underlying mechanisms, and presents four concrete optimization tactics—layered healing, locator caching, visual/semantic throttling, and asynchronous repair—backed by real‑world case data showing up to 43% faster regressions and 52% lower maintenance cost.

DevOps · Performance Optimization · UI testing

0 likes · 8 min read

Deepin Linux

Apr 18, 2026 · Fundamentals

Mastering Process Context Switching: What the CPU Actually Does

This article breaks down the fundamentals of process context switching, explaining CPU registers, program counters, the three-step switch routine, trigger conditions, performance impact, monitoring tools, and practical optimization techniques to help interview candidates answer confidently.

Linux · Operating System · Performance Optimization

0 likes · 29 min read

ByteDance SE Lab

Apr 17, 2026 · Industry Insights

How DisCoGC Cuts Storage Costs by 20%: A Deep Dive into ByteStore’s New GC Paradigm

This article analyzes the DisCoGC algorithm introduced by ByteDance, explaining how its discard‑centric garbage collection eliminates the write‑amplification vs. space‑amplification trade‑off in log‑structured storage, details the engineering challenges of multi‑layer deployment, and presents production results showing up to 20% TCO reduction without impacting latency.

Cost reduction · Garbage Collection · Performance Optimization

0 likes · 19 min read

JD Tech

Apr 16, 2026 · Industry Insights

How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture

This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.

Apache Doris · Batch Processing · Coupon Search

0 likes · 12 min read

Ctrip Technology

Apr 16, 2026 · Big Data

How Ray + DuckDB Cut 9B-Row Attribution Queries from 40s to 15s

When attribution analysis on over 900 million rows slowed to more than 40 seconds and threatened cluster stability, Ctrip's smart attribution team rebuilt the architecture with Ray and DuckDB, achieving sub‑15‑second query times, a 160% performance gain, and complete resource isolation.

Attribution Analysis · Big Data · DuckDB

0 likes · 22 min read

DataFunTalk

Apr 16, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article details Xiaohongshu's data platform evolution from a simple ClickHouse‑based ad‑hoc system to a Lambda‑style architecture and finally a lakehouse solution, highlighting how the adoption of a new incremental computing model reduced architectural complexity, resource consumption and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse

0 likes · 21 min read

Architect Chen

Apr 16, 2026 · Big Data

Supercharge Kafka Consumer Performance: Parallelism, Batching, and Multithreading

This guide explains practical techniques to dramatically increase Kafka consumer throughput, including scaling consumer instances or partitions, tuning fetch and poll parameters, and implementing a multithreaded consumer model, while also covering hardware, JVM, and OS optimizations and monitoring recommendations.

Batch Fetch · Consumer Parallelism · Kafka

0 likes · 5 min read

Qborfy AI

Apr 16, 2026 · Artificial Intelligence

How Trace Analysis Turns AI Agents from Black Boxes into Optimized Systems

Trace analysis converts the opaque decision‑making of AI agents into observable data, enabling systematic collection, parallel error detection, targeted improvements, and iterative experimentation, while revealing common failure patterns, budgeting trade‑offs, over‑fitting risks, and cost‑optimization opportunities through a reusable Trace Analyzer Skill framework.

AI · Agent Debugging · LLM

0 likes · 33 min read

Woodpecker Software Testing

Apr 15, 2026 · Artificial Intelligence

How AI Testing Tools Redefine Performance Optimization: A New Paradigm

Amid exploding large‑model deployments, AI teams struggle with slow test feedback, but AI‑native testing tools—through intelligent load modeling, inference‑layer root‑cause analysis, and self‑healing loops—demonstrate concrete latency reductions, resource savings, and faster issue remediation.

AI testing · MLOps · Observability

0 likes · 6 min read

Java Web Project

Apr 15, 2026 · Backend Development

How We Cut Spring Boot Startup from 12 s to 3 s with GraalVM Native Image

This article walks through converting a Spring Boot order‑query microservice to a GraalVM Native Image, detailing environment setup, common build pitfalls with concrete code fixes, Docker multi‑stage packaging, K8s scaling comparison, performance benchmarks, CI/CD integration, and guidance on when Native Image is appropriate.

Docker · Kubernetes · Performance Optimization

0 likes · 12 min read

Tencent Technical Engineering

Apr 12, 2026 · Operations

How TencentOS Engineers Revamped Linux Swap for 5‑20% Performance Gains

This article translates and consolidates three LWN analyses of the Linux swap subsystem modernization led by TencentOS kernel engineer Kairui Song, detailing the introduction of swap tables, removal of the swap map, virtual swap concepts, code changes, performance improvements of up to 20 % and the broader impact on the kernel community.

Linux kernel · Memory Management · Performance Optimization

0 likes · 27 min read

Deepin Linux

Apr 12, 2026 · Fundamentals

Why TLB Matters: Unlocking Linux Kernel Performance

This article explains the role of the Translation Lookaside Buffer (TLB) in Linux virtual‑memory translation, covering basic address concepts, page‑table mechanics, TLB operation, flush and synchronization strategies, hardware vs software management, Linux kernel APIs, and a practical C benchmark comparing sequential and random memory accesses.

Cache · Operating Systems · Performance Optimization

0 likes · 36 min read

Old Zhang's AI Learning

Apr 11, 2026 · Artificial Intelligence

Mastering SGLang: KV Cache and RadixAttention for Faster LLM Inference

This article reviews the DeepLearning.ai short course on SGLang, explains why large‑language‑model inference is slow, details how KV Cache reduces the per‑token attention computation from O(n²) to O(n), introduces RadixAttention for cross‑request caching, and presents code examples and benchmark results showing up to 10× speedup in real‑world RAG scenarios.

KV cache · LLM inference · Performance Optimization

0 likes · 13 min read

ITPUB

Apr 10, 2026 · Backend Development

How a Simple Refactor and Parallelism Cut Java Loop Time from 26s to 0.7s

A new team member transformed a painfully slow Java data‑processing routine—originally taking 26,856 ms—by refactoring nested loops, extracting repeated calculations, and introducing a thread‑pool for parallel execution, reducing runtime to just 748 ms, and the article walks through the before‑and‑after code and key techniques.

Java · Performance Optimization · parallel computing

0 likes · 8 min read

Woodpecker Software Testing

Apr 10, 2026 · Operations

How Adversarial Testing Drives Hidden Performance Gains

Adversarial testing transforms performance optimization by injecting extreme, realistic failures—such as cache avalanches, CDN outages, or slow SQL—to expose fragile boundaries, tighten observability, and create a rapid, evidence‑driven feedback loop that prevents costly production incidents.

Microservices · Observability · Performance Optimization

0 likes · 8 min read

DataFunTalk

Apr 10, 2026 · Big Data

How Xiaohongshu Cut Data Architecture Costs by Two‑Thirds with Incremental Computing

This article analyzes Xiaohongshu's data platform evolution—from a simple ClickHouse‑based analytics layer to a Lambda architecture and finally a lakehouse design—highlighting how adopting a new incremental computing model reduced architecture complexity, resource consumption, and development effort each to roughly one‑third while delivering sub‑second query performance on petabyte‑scale data.

Big Data · Data Architecture · Lakehouse

0 likes · 22 min read

Black & White Path

Apr 8, 2026 · Artificial Intelligence

Run Massive AI Models on a Single PC: The 1‑Bit LLM Revolution

Microsoft’s open‑source bitnet.cpp moves 100‑billion‑parameter LLM inference from GPU‑only hardware to ordinary CPUs by replacing floating‑point matrix multiplication with integer add‑subtract, cutting energy use by 82% and memory by 90% while delivering up to a 6× speedup on x86/ARM hardware.

1-bit LLM · BitNet · CPU inference

0 likes · 7 min read

JavaGuide

Apr 7, 2026 · Information Security

Why Brute‑Force Won’t Cut It for Sensitive‑Word Filtering (And What Actually Works)

The article walks through the evolution of sensitive‑word filtering—from naïve brute‑force scanning to Trie, Aho‑Corasick automaton, Double‑Array Trie, and DFA implementations—detailing their algorithms, time/space complexities, concrete Java code examples, performance trade‑offs, high‑concurrency optimizations, and practical production advice for building a robust content‑moderation system.

Aho-Corasick · DFA · Double-Array Trie

0 likes · 26 min read

James' Growth Diary

Apr 6, 2026 · Artificial Intelligence

10 Practical LangChain Performance Hacks to Speed Up and Cut Costs

This article presents ten concrete techniques—including in‑memory and Redis caching, semantic caching, parallel execution, batch processing, prompt compression, model routing, streaming output, and connection‑pool reuse—to dramatically reduce latency and token costs in production LangChain applications.

LangChain · Node.js · Performance Optimization

0 likes · 14 min read

ITPUB

Apr 2, 2026 · Operations

Why Your SSD Slows Down Over Time and How to Fix It on Linux

This guide explains the reasons behind SSD performance degradation, such as write‑amplification and garbage collection, and provides practical Linux techniques—including enabling TRIM, maintaining free space, reducing unnecessary writes, and using smartctl—to restore and preserve SSD speed.

.trim · Linux · Performance Optimization

0 likes · 6 min read

Tencent Architect

Apr 2, 2026 · Operations

How Modernizing Linux Swap Boosts Performance and Cuts Memory Overhead

This article translates and consolidates Jonathan Corbet’s three-part “Modernizing swapping” series, detailing the introduction of swap tables, removal of swap maps, and virtual swap concepts that together improve Linux kernel swap performance by up to 20%, reduce metadata memory by up to 30%, and simplify the codebase.

Linux kernel · Performance Optimization · swap map

0 likes · 27 min read

AI Architecture Path

Apr 1, 2026 · Frontend Development

How Pretext Eliminates DOM Reflows for Ultra‑Fast Text Measurement

Pretext, a zero‑DOM, high‑performance text measurement engine created by React core contributor chenglou, uses Canvas‑based calculations and a two‑stage prepare/layout workflow to avoid layout reflows, delivering up to 500× speed gains for virtual scrolling, rich‑text rendering, and AI‑driven UI layout predictions.

Performance Optimization · Pretext · text measurement

0 likes · 7 min read

Deepin Linux

Mar 28, 2026 · Fundamentals

Unlocking Linux Performance: A Deep Dive into NUMA Architecture

This article explains the core principles of NUMA, its deep integration with the Linux kernel, practical memory‑node and scheduling mechanisms, real‑world database and virtualization use cases, and step‑by‑step commands for inspecting and tuning NUMA on modern servers.

Linux kernel · Memory Management · NUMA

0 likes · 23 min read

vivo Internet Technology

Mar 25, 2026 · Industry Insights

How Vivo Scaled Marketing Automation with Presto, Bitmap, and StarRocks

This case study details how Vivo’s marketing automation platform evolved its data‑driven architecture—from a Presto‑based wide‑table design, through a Bitmap optimization, to a StarRocks migration—addressing performance bottlenecks, reducing resource costs, and enhancing data security.

Big Data · Bitmap · Data Architecture

0 likes · 11 min read

Top Architect

Mar 25, 2026 · Backend Development

Boost API Performance 10× with a Three‑Tier Cache Pyramid in Spring Boot 3

This article explains how to design and implement a three‑level cache pyramid (Caffeine → Redis → MySQL) in Spring Boot 3, covering configuration, a reusable CacheTemplate, hot‑key handling, random TTL, warm‑up, monitoring, and load‑test results that show latency dropping from tens of milliseconds to a few milliseconds while cutting CPU and network usage dramatically.

Backend Development · Caffeine · Java

0 likes · 11 min read

Alibaba Cloud Developer

Mar 25, 2026 · Databases

How AliSQL AI Diagnoses and Eliminates MySQL Replication Lag

This article analyzes the severe replication‑delay issues in MySQL master‑slave setups, identifies four typical workload patterns that cause lag, demonstrates how AliSQL's AI assistant pinpoints the root causes, and explains the kernel‑level optimizations that completely remove the delay.

AI Diagnosis · AliSQL · Performance Optimization

0 likes · 13 min read

AI Explorer

Mar 23, 2026 · Artificial Intelligence

How Unsloth Studio Turns Local AI Training into a Simple, High‑Performance Experience

Unsloth Studio, an open‑source local AI studio, combines a sleek web UI with a custom Triton kernel that claims up to 2× faster training, 70% VRAM savings (80% for RL), supports over 500 models, visual data‑recipe workflows, and both desktop and Python library usage for developers, researchers, and hobbyists.

AI Studio · Local AI · Model Training

0 likes · 7 min read

Baidu Geek Talk

Mar 23, 2026 · Databases

How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture

This article analyzes the challenges of scaling ClickHouse within Baidu’s MEG data platform and details a lake‑house solution that decouples storage and compute, integrates a meta‑service for transparent data access, optimizes query performance through caching, data roll‑up and layout tuning, and introduces a unified query gateway that gracefully falls back to Spark for complex workloads.

ClickHouse · Data Platform · Lakehouse

0 likes · 25 min read


Architect's Guide

Mar 20, 2026 · Backend Development

How We Cut 1‑Second Query Times in a Legacy WAF Dashboard Using Redis Caching

Facing slow page loads in a legacy WAF reporting system, we dissected a 1000‑line Java method, introduced hourly aggregation, Redis auto‑increment counters, and scheduled synchronization, eliminating costly SQL scans and achieving sub‑second queries on 1.5 million logs, while outlining remaining optimization opportunities.

Java · Performance Optimization · SQL

0 likes · 12 min read


JD Tech Talk

Mar 17, 2026 · Backend Development

How to Build a MyBatis Plugin that Shields Databases from Sudden Traffic Spikes

This article explains the challenges of sudden traffic bursts on applications and databases, outlines a MyBatis plugin design that intercepts SQL, uses fingerprint‑based throttling with configurable policies, and details the development, optimization, testing, and documentation steps performed with pair‑programming assistance.

MyBatis · Performance Optimization · SQL interceptor

0 likes · 9 min read


Architecture & Thinking

Mar 13, 2026 · Databases

Why MySQL Deep Pagination Slows Down Your E‑commerce Site and How to Fix It

The article explains how deep pagination on massive MySQL tables causes full‑table scans, massive I/O, and memory pressure, then presents six concrete optimization techniques—including delayed join, cursor pagination, covering indexes, ID‑range pagination, caching, and partitioning—backed by a real‑world e‑commerce case study and detailed execution‑plan analysis.

Performance Optimization · SQL · deep pagination

0 likes · 18 min read


Woodpecker Software Testing

Mar 10, 2026 · Operations

Uncovering Test Data Generation Bottlenecks and Proven Ways to Accelerate CI Pipelines

The article examines why traditional manual or full‑backup test data creation becomes a performance bottleneck in modern micro‑service, TB‑scale environments, identifies three structural imbalances—data‑dependency, generation‑logic, and semantic redundancy—and presents a three‑layered optimization framework plus engineering best‑practices that can cut data‑prep time by up to 68%.

Automation · Microservices · Performance Optimization

0 likes · 8 min read


Code Wrench

Mar 8, 2026 · Artificial Intelligence

How to Build Low‑Latency AI‑Powered Video Calls with Go and WebRTC

This article breaks down the latency challenges of combining AI with WebRTC, compares edge and cloud processing architectures, and provides a detailed Go‑based implementation—including RTP interception, AI model integration, real‑time translation pipelines, and performance optimizations—for ultra‑responsive video conferencing.

AI · Edge Computing · Go

0 likes · 7 min read


Machine Learning Algorithms & Natural Language Processing

Mar 5, 2026 · Artificial Intelligence

Mamba’s SSD Framework Shatters Serial Bottleneck, Outperforms vLLM and SGLang

The new Speculative Speculative Decoding (SSD) framework, built by the Mamba and FlashAttention authors, eliminates the serial draft‑verification bottleneck in LLM inference by running the draft model asynchronously, introducing a speculation cache and the Saguaro algorithm, which together deliver up to 5× speedup over autoregressive baselines and up to 2× over optimized engines on Llama‑3 and Qwen‑3, reshaping the latency‑throughput trade‑off.

Asynchronous Parallelism · LLM inference · Performance Optimization

0 likes · 9 min read


StarRocks

Mar 5, 2026 · Big Data

How Fanatics Scaled to PB‑Level Data with StarRocks & Apache Iceberg Lakehouse

Fanatics unified its fragmented data stack by building a StarRocks‑powered Lakehouse on Apache Iceberg, replacing Redshift, Snowflake, Athena, and Druid, which cut costs by up to 95%, delivered sub‑second dashboard queries on petabyte‑scale data, and enabled real‑time and historical analytics on a single platform.

Apache Iceberg · Data Architecture · Fanatics

0 likes · 10 min read


DeWu Technology

Mar 4, 2026 · Backend Development

How the Multiplicative Tree Framework Enables Instant Formula Deployment and Stable High‑Performance Ranking

The article details the design and evolution of the Multiplicative Tree framework—from version 1.0 to 3.0—showing how a DSL‑based, compile‑time‑checked configuration system delivers instant formula deployment, robust stability safeguards, and significant performance gains for multi‑objective ranking models.

C · DSL · Java

0 likes · 18 min read


Woodpecker Software Testing

Mar 4, 2026 · Artificial Intelligence

Deep Dive into Adversarial Testing Performance Optimization for AI Systems

The article examines Adversarial Testing Performance Optimization (ATPO) as a new industrial-quality paradigm, detailing how adversarial samples expose hidden performance bottlenecks across AI pipelines, presenting three typical adversarial loads with corresponding optimization targets, common implementation pitfalls, and emerging intelligent approaches using reinforcement learning and digital twins.

AI pipelines · Digital Twin · Performance Optimization

0 likes · 8 min read


Code Wrench

Mar 4, 2026 · Backend Development

How to Build a 50‑Player Real‑Time Battle Server in Go: Architecture & Performance

This article explains how to design a Go‑based backend for a 50‑player real‑time battle game, covering concurrency models, GC tuning, matching algorithms, fixed‑frame loops, AOI optimization, KCP networking, and performance‑boosting techniques such as object pooling and command batching.

Game Backend · Go · Performance Optimization

0 likes · 8 min read


JD Cloud Developers

Mar 3, 2026 · Mobile Development

How to Integrate AI into Mobile Apps Without Sacrificing User Experience

This article examines the practical challenges of adding AI features to mobile clients, highlighting device fragmentation, performance trade‑offs, user pain points, and a layered approach that balances lightweight models, graceful degradation, and edge‑cloud collaboration to keep the experience smooth for the majority of users.

AI integration · AR gestures · Mobile Development

0 likes · 15 min read


Old Zhang's AI Learning

Feb 28, 2026 · Artificial Intelligence

How OpenAI Engineers Leverage Codex: 6 Proven Best Practices

The article reveals how OpenAI’s engineering teams integrate Codex into daily workflows, detailing seven core application scenarios—from code understanding and refactoring to performance optimization and flow maintenance—and presents six concrete best‑practice guidelines for maximizing AI‑assisted development efficiency.

AI code generation · Codex · Performance Optimization

0 likes · 7 min read


Woodpecker Software Testing

Feb 28, 2026 · Operations

Boost Large Language Model Testing Performance: Essential Strategies for Test Engineers

The article outlines four engineering‑driven approaches—layered test granularity, cache‑driven golden sample pools, lightweight evaluation proxies, and test‑as‑code with resource‑aware scheduling—to dramatically cut LLM testing latency, improve reliability, and lower costs, illustrated with real‑world banking, government, and medical case studies.

Cache · Evaluation Proxy · Performance Optimization

0 likes · 8 min read


Woodpecker Software Testing

Feb 27, 2026 · Artificial Intelligence

How Test Experts Can Accelerate Model Evaluation and Boost Performance

The article analyzes why over 73% of AI projects stall during model evaluation and presents three optimization paths (low‑latency pipelines, multidimensional bias diagnostics, and lightweight online probes) that together speed up evaluation by up to 13× and shorten fault detection from hours to seconds.

AI testing · Model Evaluation · Performance Optimization

0 likes · 6 min read


PaperAgent

Feb 27, 2026 · Artificial Intelligence

How DualPath Eliminates Storage Bandwidth Bottlenecks in Agentic LLM Inference

This article analyzes the DualPath architecture that redesigns KV‑Cache data paths to overcome storage‑NIC saturation in Prefill‑Decode LLM systems, presenting theoretical proofs, detailed engineering solutions, and extensive offline and online benchmarks that demonstrate up to 2.25× performance gains.

DualPath · LLM inference · Performance Optimization

0 likes · 9 min read


Raymond Ops

Feb 26, 2026 · Operations

What Core Skills Do 500k‑CNY Ops Engineers Master?

This article breaks down the essential technical and soft‑skill competencies—ranging from deep Linux kernel knowledge and database optimization to cloud‑native Kubernetes expertise, observability, automation, cost‑saving architecture, and security—that distinguish high‑salary operations engineers and provides a practical roadmap for achieving them.

Kubernetes · Observability · Operations

0 likes · 38 min read


Linux Tech Enthusiast

Feb 26, 2026 · Operations

A Comprehensive Guide to Linux Performance Optimization

This article walks through Linux performance optimization by explaining core metrics such as throughput and latency, describing how to interpret load average, CPU usage, context switches, memory management, and swap, and showing step‑by‑step usage of tools like vmstat, pidstat, perf, and dstat with concrete command examples and analysis cases.

CPU · Linux · Memory

0 likes · 37 min read


Big Data Tech Team

Feb 12, 2026 · Big Data

Mastering the DWS Layer: Core Strategies for Scalable Data Warehouses

This article provides a comprehensive, business‑driven analysis of the Data Warehouse Service (DWS) layer, covering its core positioning, design goals, modeling and aggregation tactics, storage optimizations, typical challenges with practical solutions, and best‑practice recommendations for building efficient, cost‑effective data services.

DWS Layer · Data Warehouse · Performance Optimization

0 likes · 8 min read


Mike Chen's Internet Architecture

Feb 12, 2026 · Backend Development

Boost Web Performance 5× with Nginx Static‑Dynamic Separation Architecture

This article explains how separating static and dynamic traffic with Nginx, configuring precise location rules, cache headers, and kernel optimizations can increase throughput by three to five times in high‑concurrency web architectures while reducing backend load and improving maintainability.

Backend Architecture · Performance Optimization · dynamic routing

0 likes · 4 min read


macrozheng

Feb 12, 2026 · Fundamentals

How Time Slices, Hyper‑Threading, and Context Switching Enable Multithreading

The article explains why modern CPUs, even single‑core ones, can run multiple threads by using short time slices, hyper‑threading hardware, and context‑switch mechanisms, and it discusses the costs, Linux monitoring tools, scheduling strategies, and practical ways to reduce switching overhead.

CPU scheduling · Hyper-threading · Operating Systems

0 likes · 10 min read


Baidu Geek Talk

Feb 9, 2026 · Databases

How Mantle Redefined Cloud Object Storage Metadata for Billion‑File Scale

This article recounts how Baidu's storage team tackled the performance and scalability limits of traditional object storage by redesigning metadata handling with the Mantle and MantleX architectures, introducing a centralized IndexNode, strong consistency, delta‑record writes, and a seamless single‑node to distributed transition for massive file systems.

Filesystem · Performance Optimization · Scalability

0 likes · 37 min read


Xiaohongshu Tech REDtech

Feb 5, 2026 · Databases

How RedSQL Supercharged MySQL Performance and Achieved Zero‑Data‑Loss Replication

This article details Xiaohongshu's RedSQL MySQL kernel project, describing three major solutions—high‑throughput seckill optimization, a Binlog Server‑based zero‑data‑loss replication scheme, and second‑level DDL column addition—along with additional kernel enhancements that together delivered multi‑fold performance gains and improved stability.

DDL · Data Consistency · Performance Optimization

0 likes · 12 min read


Java Tech Enthusiast

Feb 5, 2026 · Backend Development

Boost SpringBoot Debugging: Seamless Integration with Hera Log Platform

This guide explains how to integrate the Hera log platform into SpringBoot applications, covering architecture, Maven dependencies, YAML configuration, custom field providers, trace enablement, console usage, performance tuning, high‑availability design, and common pitfalls to dramatically improve log‑search efficiency in distributed systems.

Distributed Tracing · Hera · Performance Optimization

0 likes · 14 min read


Java Architect Handbook

Feb 3, 2026 · Backend Development

Speeding Up 100k MySQL Inserts: From 5 Minutes to 3 Seconds in Java

This article walks through a real‑world data‑migration case where 100,000 rows were moved from an old system to a new one, showing how naive per‑row inserts took five minutes and how a series of optimizations—batch SQL, JDBC batch mode, and multithreaded parallelism—reduced the runtime to just three seconds, while also covering common pitfalls and the final high‑performance implementation.

Batch Insert · JDBC · Java

0 likes · 11 min read


Xiaohongshu Tech REDtech

Jan 30, 2026 · Backend Development

How Java Virtual Threads Cut Latency by 31× and Slash CPU Use in Production

This article explains the principles of Java virtual threads, compares them with traditional platform threads, details RedJDK21’s implementation and performance improvements (up to a 31‑fold latency reduction and 24% CPU savings) in large‑scale services at Xiaohongshu, and discusses migration challenges, lock handling, monitoring, and the future roadmap.

JVM · Java · Performance Optimization

0 likes · 29 min read


java1234

Jan 30, 2026 · Backend Development

How to Reduce MyBatis Batch Insert from 5 Minutes to 3 Seconds? Three Key Optimizations

The article walks through three concrete optimizations—batch SQL, JDBC batch mode with rewriteBatchedStatements, and multithreaded parallel inserts—that shrink a 100,000‑row MyBatis insertion from five minutes to three seconds, while highlighting configuration details, performance gains, and common pitfalls.

Batch Insert · JDBC · Java

0 likes · 10 min read


Java Companion

Jan 29, 2026 · Backend Development

How to Cut MyBatis Batch Insert Time from 5 Minutes to 3 Seconds: Three Key Optimizations

The article walks through turning a naïve MyBatis loop that took five minutes to insert 100,000 rows into a high‑performance solution that finishes in three seconds by applying batch SQL, JDBC batch mode with rewriteBatchedStatements, and multithreaded parallel execution, while highlighting pitfalls and best‑practice configurations.

Batch Insert · ExecutorType.BATCH · JDBC

0 likes · 9 min read


Mike Chen's Internet Architecture

Jan 28, 2026 · Backend Development

Boost Web Performance: Master Nginx Static‑Dynamic Separation

This article explains how Nginx can separate static assets from dynamic requests using location rules and reverse‑proxying, provides a complete configuration example, and details the performance gains from zero‑copy file serving, gzip compression, caching headers, and CDN integration.

NGINX · Performance Optimization · caching

0 likes · 5 min read


Deepin Linux

Jan 28, 2026 · Fundamentals

Unlock Linux Performance: Master Memory Alignment and Struct Optimization

This article explains the core principles of memory alignment on Linux, shows how misaligned data harms CPU cache and execution speed, provides concrete C code examples and benchmark results, and offers practical techniques—including compiler directives and struct layout tricks—to achieve optimal performance.

C programming · Linux · Performance Optimization

0 likes · 22 min read


dbaplus Community

Jan 26, 2026 · Cloud Native

How Starbucks China Revamped Its Log Platform: From VMs to Cloud‑Native Kubernetes with 80% Faster Queries

Starbucks China’s logging team migrated several petabytes of logs from legacy VM‑based Elasticsearch clusters to a cloud‑native bare‑metal Kubernetes platform, upgrading ES from 7.x to 8.x, containerizing components, optimizing storage and Kafka, and achieving up to 80% query speed gains, 30% CPU reduction, and 200% write‑throughput improvement.

Performance Optimization · data pipeline · log platform

0 likes · 25 min read


BirdNest Tech Talk

Jan 24, 2026 · Artificial Intelligence

How to Build an AI Comic‑Generating Agent with LangGraphGo and Skills

This article walks through constructing a multi‑step AI comic‑generation agent using the LangGraphGo framework and the GoSkills plugin system, covering architecture design, declarative tool definitions, automatic configuration discovery, parameter conversion, code implementation, common pitfalls, best practices, and performance optimizations.

AI agents · GoSkills · LLM tool integration

0 likes · 22 min read


Deepin Linux

Jan 24, 2026 · Fundamentals

Unlocking Linux Performance: A Deep Dive into io_uring and Its Advantages

This comprehensive guide explains why traditional I/O models become bottlenecks in high‑performance computing, introduces the modern io_uring framework with its submission and completion queues, walks through its design goals, core concepts, workflow, performance comparisons, optimization tips, real‑world use cases, and provides complete C examples for practical adoption.

C programming · Linux · Performance Optimization

0 likes · 48 min read


Volcano Engine Developer Services

Jan 21, 2026 · Operations

How Tail‑Based Sampling Boosts Distributed Tracing Accuracy While Cutting Costs

This article explains the challenges of accurate RED metric collection in high‑traffic microservices, compares head‑based and tail‑based sampling, and details Volcano Engine APMPlus's multi‑level, hash‑routed tail sampling design, performance optimizations, and real‑world evaluation results.

APM · Distributed Tracing · Kubernetes

0 likes · 13 min read


Java Architect Handbook

Jan 21, 2026 · Backend Development

Why OpenFeign’s First Call Is Slow and How to Fix It

The article analyzes why the first OpenFeign call in micro‑service systems incurs seconds of latency, breaks down five root causes such as lazy client initialization, dynamic proxy creation, load‑balancer cold start, network handshake, and hidden dependencies, and provides concrete verification steps and four practical optimizations to move the cost to application start‑up.

Feign client · Java · Microservices

0 likes · 15 min read


Linux Tech Enthusiast

Jan 21, 2026 · Fundamentals

Understanding TCP: Protocol Basics, Handshakes, States, and Performance Optimizations

TCP is a connection‑oriented, reliable, byte‑stream transport protocol; this article explains its header fields, state diagram, three‑way handshake, four‑way termination, TIME_WAIT handling, optimization techniques, and contrasts it with UDP, providing detailed Linux commands and kernel parameters.

Handshake · Linux · Performance Optimization

0 likes · 26 min read


大转转FE

Jan 21, 2026 · Frontend Development

Boost Frontend Efficiency: How zzChromeTools Eliminates Hidden Time Sinks

This article explains how the zzChromeTools Chrome extension tackles the often‑overlooked “invisible time killers” in frontend development by injecting AOP‑style hooks into the main world, capturing beacon requests, and presenting them in a lightweight DevTools panel, dramatically reducing cognitive load and debugging time.

Chrome Extension · Data Tracking · MV3

0 likes · 27 min read


Xiaohongshu Tech REDtech

Jan 19, 2026 · Databases

How Merged Seckill Boosts MySQL Write Throughput 5× for High‑Traffic E‑Commerce

The article details a MySQL kernel‑level merged‑seckill optimization that replaces traditional queue‑based flash‑sale handling, achieving up to 5.5× higher TPS (up to 23,543 TPS on 128 threads) and sustaining 15,000+ orders per second, while remaining transparent to applications and preserving compatibility with existing SQL.

Cache · Lock · Performance Optimization

0 likes · 11 min read


Architect's Guide

Jan 16, 2026 · Databases

How to Safely Update Billions of MySQL Rows Without Overloading Binlog

This article explains why a naïve full‑table UPDATE on massive MySQL tables can cripple replication, explores deep‑pagination and IN‑clause inefficiencies, and presents a batch‑processing strategy using SQL_NO_CACHE, FORCE INDEX, and rate‑controlled scripts to perform safe, high‑performance updates.

Batch Processing · Binlog · Full Table Update

0 likes · 8 min read


AI Insight Log

Jan 15, 2026 · Frontend Development

Vercel Packages a Decade of React Best Practices into an Open‑Source AI Skill

Vercel has open‑sourced a structured "react‑best‑practices" Skill that encodes ten years of React and Next.js performance wisdom, prioritizes eight categories from critical request‑waterfall elimination down to low‑impact tweaks, and equips AI agents to automatically avoid common inefficiencies such as unnecessary bundle bloat and async‑await pitfalls.

AI Agent · Next.js · Performance Optimization

0 likes · 6 min read


Big Data Tech Team

Jan 12, 2026 · Big Data

Avoid the 5 Fatal DWS Design Traps and Build Scalable Data Warehouses

This article dissects the five most common pitfalls when transitioning from DWD to DWS aggregation tables—such as chimney‑style designs, over‑wide tables, grain mismatches, missing drill‑down keys, and performance neglect—and offers concrete, production‑ready solutions to create reusable, efficient, and cost‑effective data‑warehouse layers.

DWS Design · Data Warehouse · ETL

0 likes · 9 min read


Java Architect Handbook

Jan 10, 2026 · Backend Development

Boost API Speed 14× with a 3‑Level Cache Pyramid in Spring Boot

By combining a local Caffeine cache, a remote Redis layer, and a MySQL database into a three‑tier cache pyramid, this guide shows how to reduce API response time from 28 ms to 2 ms, cut CPU usage by 35%, and achieve up to 14‑fold performance gains, complete with configuration, code, and monitoring tips.

Cache · Caffeine · Performance Optimization

0 likes · 12 min read


Deepin Linux

Jan 10, 2026 · Operations

Master Linux Process Load Balancing with cgroups and taskset: A Step‑by‑Step Guide

This article explains how to use Linux's built‑in cgroups and taskset tools to monitor, limit, and bind process workloads, providing detailed commands, subsystem explanations, collaborative usage strategies, real‑world case studies, and troubleshooting tips for improving system performance and stability.

Linux · Performance Optimization · cgroups

0 likes · 29 min read


Design Hub

Jan 9, 2026 · Artificial Intelligence

LTX‑2 Acceleration Secrets: Boost Speed, Stability, and Visual Quality

This article walks through practical steps to speed up LTX‑2 AI video generation—enabling the NVFP4 model, updating NVIDIA drivers and CUDA, using FP8 text encoders, and applying a custom prompt‑optimizing assistant—showing memory savings, sub‑minute rendering at 1280×720, and noticeable quality gains.

AI video generation · FP8 · LTX-2

0 likes · 11 min read


Ray's Galactic Tech

Jan 8, 2026 · Databases

Boost SQL Performance Without Rewriting Queries: Indexes, Partitioning, Caching

This guide presents a comprehensive, step‑by‑step roadmap for accelerating slow SQL queries without altering the original statements, covering index creation, database parameter tuning, table partitioning, caching layers, read‑write splitting, sharding, statistics updates, hardware choices, and middleware routing.

Database Tuning · Performance Optimization · SQL

0 likes · 8 min read


Alibaba Cloud Native

Jan 7, 2026 · Cloud Native

How Alibaba Cloud’s One‑Click I/O Diagnosis Tackles Cloud‑Native I/O Bottlenecks

This article explains how Alibaba Cloud CloudMonitor 2.0 integrates SysOM intelligent diagnosis to automatically detect, analyze, and remediate I/O anomalies in multi‑tenant cloud environments, detailing the architecture, dynamic threshold algorithm, anomaly‑trigger logic, and real‑world case studies.

Cloud Native · Performance Optimization · aliyun

0 likes · 13 min read
