Tagged articles

parallelism

115 articles · Page 1 of 2

Jun 23, 2026 · Frontend Development

TypeScript 7 RC Rewrites Compiler in Go, Boosting Speed ~10×

Microsoft’s TypeScript 7 RC replaces the compiler core with a Go implementation, delivering a roughly ten‑fold speedup across CLI builds and editor services while remaining fully compatible with TypeScript 6 semantics. The article also gives detailed upgrade guidance and covers the parallel‑compilation options.

Go · TypeScript · Upgrade

0 likes · 8 min read

Data Party THU

Jun 15, 2026 · Artificial Intelligence

Beyond Single-Model Limits: How Collaborative Multi-Agent Architecture Drives AI Evolution

The article examines the shortcomings of single-agent AI systems—such as context overload, lack of specialization, and poor scalability—and explains how multi‑agent architectures with coordinated, specialized agents, shared memory, and parallel execution overcome these issues, offering a roadmap for the next generation of AI platforms.

AI Architecture · Agent communication · Multi-Agent Systems

0 likes · 8 min read

ITPUB

Jun 1, 2026 · Databases

Why a 2 TB PostgreSQL Instance Still Runs Out of Memory: Hidden work_mem Pitfalls

Even with 2 TB of RAM, PostgreSQL can hit OOM because work_mem is a per‑operation budget that multiplies across sorts, hashes, parallel workers and long‑lived memory contexts, so blindly raising it often hides deeper query‑design and statistics issues.

Memory Management · OOM · PostgreSQL

0 likes · 12 min read
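
As a rough illustration of how the per-operation budget described above can multiply, here is a back-of-the-envelope sketch in Python. It is not taken from the article: the function name `worst_case_work_mem_mib` and the workload numbers are hypothetical, and the model is deliberately pessimistic, assuming every sort or hash node in every process fills its whole `work_mem` budget at the same moment.

```python
def worst_case_work_mem_mib(work_mem_mib, nodes_per_query, parallel_workers, active_queries):
    """Pessimistic upper bound: every sort/hash node in every process fills its budget."""
    processes_per_query = parallel_workers + 1  # parallel workers plus the leader process
    return work_mem_mib * nodes_per_query * processes_per_query * active_queries


# Hypothetical workload: work_mem = 256 MiB, 4 sort/hash nodes per query,
# 4 parallel workers per query, 200 queries running at once.
total_mib = worst_case_work_mem_mib(256, 4, 4, 200)
print(f"{total_mib / 1024 / 1024:.2f} TiB in the worst case")
```

Real usage is normally far lower, since few nodes hit their limit together, and hash-based nodes may be allowed more than `work_mem` depending on `hash_mem_multiplier`.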

FunTester

May 20, 2026 · Artificial Intelligence

How Anthropic’s Multi‑Agent Orchestration Enables Parallel Workflows

The article explains why a single AI agent hits context and execution limits, describes Anthropic’s multi‑agent orchestration that splits tasks among dedicated sub‑agents coordinated by a controller, discusses model selection, communication, observability, and outlines scenarios where parallel orchestration delivers real benefits.

AI Agents · Multiagent · Observability

0 likes · 11 min read

LuTiao Programming

May 3, 2026 · Fundamentals

Stop Mixing Them Up: Real Differences Between Concurrency, Parallelism & Asynchrony

The article clarifies the distinct meanings of concurrency, parallelism and asynchrony, explains their underlying mechanisms with concrete Java examples, highlights common misconceptions, and offers practical guidance on when to apply each model in real‑world systems.

Java · asynchrony · concurrency

0 likes · 8 min read

Java Architect Handbook

Apr 20, 2026 · Backend Development

Concurrency vs Parallelism in Java: Definitions, CPU Mechanics, and Interview Tips

The article explains how concurrency differs from parallelism by defining logical versus physical simultaneity, illustrates the concepts with everyday analogies and CPU scheduling details, provides Java code examples, lists common interview follow‑up questions, and offers a concise mnemonic for remembering the distinction.

Backend Development · CPU · Java

0 likes · 10 min read

AI Engineer Programming

Mar 30, 2026 · Artificial Intelligence

Agent, Multi‑Agent, Deep Agent: Start Simple, Add Complexity Only When Needed

The article clarifies the distinct meanings of Agent, Multi‑Agent, and Deep Agent, explains how control shifts from engineers to models, compares architectures across nine dimensions, and shows why a lightweight harness is essential for long‑running, parallel AI‑driven software development.

Agent · Deep Agent · Harness

0 likes · 22 min read

TonyBai

Mar 11, 2026 · Fundamentals

What the Go “go func()” You Write Every Day Reveals About Tony Hoare’s 92‑Year Legacy

The article recounts Tony Hoare’s pioneering CSP theory, explains how his rejection of shared‑memory concurrency shaped Go’s goroutine and channel model, and shows why every Go developer’s simple “go func()” and “make(chan int)” embody a half‑century of groundbreaking ideas.

CSP · Channel · Go

0 likes · 10 min read

Code Mala Tang

Feb 24, 2026 · Backend Development

Why Async FastAPI Still Blocks and How to Offload Heavy Work

Following earlier fixes for unbounded queries and pagination, this article explains why async FastAPI still stalls under load, outlines the hidden bottlenecks in the request lifecycle, and provides practical rules and code examples for offloading heavy work to background workers while keeping the system scalable, idempotent, and observable.

FastAPI · async · background tasks

0 likes · 9 min read

Mike Chen's Internet Architecture

Jan 19, 2026 · Fundamentals

Concurrency vs Parallelism vs Asynchrony: Key Differences Explained

This article clarifies the distinct concepts of concurrency, parallelism, and asynchrony, detailing their definitions, implementation mechanisms, resource needs, timing semantics, and ideal use‑cases to help developers choose the right model for high‑performance systems.

Fundamentals · asynchrony · concurrency

0 likes · 5 min read

Tech Verticals & Horizontals

Jan 14, 2026 · Artificial Intelligence

Why Parallelism Matters: Designing Multi‑Agent Architectures for Scalable AI Systems

The article explains why parallelism is crucial for large‑scale AI systems—addressing I/O latency and reliability—by detailing core agent patterns, multi‑agent architectures, reliability strategies, and advanced retrieval‑augmented generation techniques, each illustrated with concrete Jupyter notebooks.

AI Governance · Multi-Agent Systems · RAG

0 likes · 6 min read

Baidu Intelligent Cloud Tech Hub

Jan 5, 2026 · Artificial Intelligence

How Baidu Tianchi Supernodes Supercharge Large‑Model Inference: Architecture, Deployment, and Optimization

This article details Baidu's Tianchi supernode design and software tuning (hardware scale‑up, deployment planning, Prefill and Decode stage optimizations, quantization strategies, and communication schemes) and shows how they raise large‑model inference throughput and cut latency while lowering per‑token cost.

AI Infrastructure · Performance Optimization · large model inference

0 likes · 20 min read

AI Insight Log

Jan 3, 2026 · Artificial Intelligence

13 Proven Tricks to Double Your AI‑Assisted Coding Efficiency (From Claude Code’s Founder)

Boris Cherny, the founder of Claude Code, reveals a detailed 13‑step workflow that combines aggressive parallelism, Opus 4.5 with Thinking mode, a shared CLAUDE.md knowledge base, custom slash commands, sub‑agents, automated formatting hooks, permission presets, deep tool integrations, and a strict verification loop to dramatically boost AI‑driven development productivity.

AI‑assisted development · Claude Code · MCP integration

0 likes · 9 min read

Lobster Programming

Nov 3, 2025 · Fundamentals

Serial vs Concurrent vs Parallel: Which Model Boosts Your Program’s Performance?

Serial execution runs tasks one after another, concurrency interleaves multiple tasks to appear simultaneous, and parallelism truly runs tasks at the same time on multiple cores, each offering distinct advantages and trade‑offs that affect efficiency, responsiveness, and programming complexity.

CPU · parallelism · performance

0 likes · 2 min read

JavaScript

Oct 14, 2025 · Frontend Development

Boost JavaScript Async Performance by Up to 80% with New Promise Techniques

While async/await simplifies JavaScript code, it can introduce significant overhead in high‑frequency or compute‑heavy scenarios; this article introduces alternative async patterns—optimized Promise chaining, parallel Promise.all, batch processing, and pooling—that can reduce context switches and deliver performance gains of up to 80%.

Async/Await · JavaScript · Performance Optimization

0 likes · 5 min read

BirdNest Tech Talk

Sep 27, 2025 · Artificial Intelligence

Boost LLM App Speed with LangChain’s RunnableParallel: A Step‑by‑Step Guide

This article explains how LangChain’s RunnableParallel component enables true parallel execution of independent sub‑tasks, walks through concrete Python examples, compares serial versus parallel runtimes, and outlines when and why to apply this pattern for faster, more capable LLM applications.

LLM · LangChain · Python

0 likes · 6 min read

JavaScript

Sep 16, 2025 · Frontend Development

Boost JavaScript Async Performance: Up to 80% Faster Than async/await

This article explains why async/await can cause performance bottlenecks in JavaScript and introduces optimized Promise‑based techniques—such as chain optimization, Promise.all parallelism, batch processing, and pooling—that can improve async execution speed by up to 80% in specific scenarios.

Async/Await · JavaScript · Performance Optimization

0 likes · 4 min read

php Courses

Sep 10, 2025 · Fundamentals

Mastering C++11 Concurrency: std::thread, std::async, and Best Practices

This guide explains why modern C++ programs need concurrency, introduces the core C++11 tools std::thread and std::async, demonstrates basic usage, parameterized threads, lambda expressions, async task handling, synchronization with mutexes, exception safety, parallel data processing, and provides best‑practice tips for efficient and safe multithreaded development.

C++ · parallelism · std::async

0 likes · 10 min read

Sohu Tech Products

Sep 3, 2025 · Fundamentals

Why Does suspend Still Feel Blocking? Unraveling Concurrency in Kotlin Coroutines

This article demystifies the differences between concurrency, asynchrony, blocking and non‑blocking in Kotlin coroutines by providing runnable examples, experiments, and detailed analysis of thread usage, dispatcher behavior, and performance outcomes.

Asynchronous · Blocking · Coroutines

0 likes · 9 min read

Code Wrench

Aug 12, 2025 · Backend Development

Master Go Concurrency: Goroutine vs Thread, GMP Model, and Practical Patterns

This article explains why Go's goroutines outperform traditional threads, details the GMP scheduling model, shows how to avoid data races with sync primitives, and provides ready-to-use concurrency libraries and patterns with clear code examples.

GMP · Goroutine · concurrency

0 likes · 8 min read

Volcano Engine Developer Services

Aug 6, 2025 · Artificial Intelligence

How VeOmni Revolutionizes Multimodal Model Training with 40% Speed Gains

VeOmni, ByteDance’s open‑source unified multimodal training framework, tackles fragmented training pipelines by integrating LoRA fine‑tuning, FSDP, Ulysses, and Expert Parallel, delivering up to 40% higher throughput, up to 55% memory savings, and streamlined one‑click deployment for LLM, VLM, and video models.

AI · Multimodal · Training

0 likes · 14 min read

Code Mala Tang

Jul 22, 2025 · Fundamentals

Boost Python Loops: Parallelism, Generators, and Profiling Made Easy

This guide shows how to accelerate slow Python for‑loops by leveraging multi‑core parallelism, memory‑efficient generators, and a suite of profiling tools, providing step‑by‑step code examples and practical tips to identify and fix performance bottlenecks.

Profiling · generators · parallelism

0 likes · 16 min read

FunTester

Jul 13, 2025 · Backend Development

Master Go Concurrency: Goroutines, Channels, and Real-World Examples

Learn how Go’s built‑in concurrency model using goroutines and channels can transform sequential code into responsive, high‑performance applications, with clear explanations of concurrency vs parallelism, practical code samples, synchronization techniques, and best practices for building scalable web servers.

Channel · Goroutine · Web Server

0 likes · 10 min read

JavaEdge

Jun 27, 2025 · Artificial Intelligence

Why Inference Engines Are Essential for Deploying Large Language Models in Production

The article explains what inference engines are, why they are needed beyond raw Python scripts, and outlines best practices such as model quantization, batching, and parallelism, while comparing popular open‑source and commercial options for production AI workloads.

AI Deployment · Batching · LLM

0 likes · 14 min read

Architect

May 18, 2025 · Artificial Intelligence

How Much GPU Memory Can One Model Use? A Deep Dive into Transformer Memory Accounting

This article breaks down GPU memory consumption for large Transformer models, explains how to estimate each component—parameters, optimizer state, activations, gradients—and shows how parallelism, mixed precision, and recomputation strategies can dramatically reduce the footprint.

AI training · GPU memory · Memory optimization

0 likes · 14 min read
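
The accounting this summary describes can be approximated with simple arithmetic. The sketch below is not from the article; it assumes the commonly cited mixed-precision Adam layout (2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 master weights plus two Adam moments per parameter), ignores activations, and uses a hypothetical function name.

```python
def training_state_gib(num_params, weight_bytes=2, grad_bytes=2, optimizer_bytes=12):
    """Memory for weights, gradients, and optimizer state in GiB (activations excluded)."""
    bytes_per_param = weight_bytes + grad_bytes + optimizer_bytes
    return num_params * bytes_per_param / 2**30


# Hypothetical 7-billion-parameter model held unsharded on a single device.
print(f"{training_state_gib(7_000_000_000):.1f} GiB before activations")
```

Sharding these states across data-parallel ranks, as ZeRO-style schemes do, lowers the per-device figure, by how much depends on which of the three states are sharded.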

Architect's Tech Stack

May 15, 2025 · Backend Development

Understanding Java Stream API: filter, map, flatMap, and Parallel Operations

This article explains how Java's Stream API enables efficient data processing through pipeline operations such as filter, map, flatMap, stream creation methods, conversion to collections, and parallel execution, providing code examples and practical usage guidelines.

Functional Programming · Java · Stream API

0 likes · 11 min read

FunTester

Apr 18, 2025 · Backend Development

Using CompletableFuture for Parallel REST Calls in Java

The article explains why serial REST calls cause performance bottlenecks, illustrates the benefits of concurrent requests, and demonstrates how Java 8's CompletableFuture can be used to implement parallel REST calls with robust exception handling, improving throughput and resource utilization.

CompletableFuture · Java · REST

0 likes · 10 min read

Cognitive Technology Team

Apr 12, 2025 · Backend Development

Using CompletableFuture with Streams for Parallel Execution in Java

The article explains how to correctly combine Java's CompletableFuture with Stream API to achieve true asynchronous parallelism, highlights common pitfalls that lead to sequential execution, and provides the proper pattern of creating a CompletableFuture stream followed by a terminal operation.

CompletableFuture · Java · Stream

0 likes · 3 min read

Top Architect

Apr 9, 2025 · Backend Development

Understanding ForkJoinPool: Principles, Usage, and Performance Evaluation in Java

This article explains the Fork/Join model and Java's ForkJoinPool, covering divide‑and‑conquer theory, task types, pool construction, core methods, performance testing, and practical recommendations such as avoiding the commonPool for blocking tasks.

ForkJoinPool · Java · concurrency

0 likes · 26 min read

Architecture Development Notes

Mar 16, 2025 · Backend Development

Choosing the Right Concurrency Model: Go vs Python vs Rust

This article compares Go, Python, and Rust concurrency implementations—covering CSP‑based goroutines, GIL constraints, and ownership‑driven thread safety—to help developers select the most suitable model for high‑throughput, CPU‑bound, or safety‑critical applications.

Go · Python · async

0 likes · 9 min read

DataFunSummit

Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes a presentation by Wang Xin, Zhihu's machine‑learning platform lead, on the ZhiLight large‑model inference framework. It covers model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · LLM · Open‑source

0 likes · 13 min read

Code Mala Tang

Feb 15, 2025 · Fundamentals

Unlock Full CPU Power in Python: A Hands‑On Guide to Multiprocessing

This article explains why Python’s Global Interpreter Lock limits CPU core usage, introduces the multiprocessing module for parallel execution of CPU‑intensive tasks, and provides step‑by‑step code examples, key concepts, synchronization tools, a real‑world image‑processing case, and best practices to dramatically speed up your programs.

CPU Bound · Multiprocessing · Python

0 likes · 9 min read

Python Programming Learning Circle

Jan 15, 2025 · Fundamentals

Communicating Sequential Processes (CSP): Concepts, Implementations, and Python Libraries

This article explains the CSP concurrency model, compares it with the Actor model, discusses its advantages and limitations, and reviews Go's native support as well as several Python libraries and experimental projects that aim to bring CSP-style parallelism to Python.

CSP · Go · async

0 likes · 11 min read

DataFunSummit

Dec 30, 2024 · Artificial Intelligence

Colossal-AI: A Scalable Framework for Distributed Training of Large Models

This presentation introduces the challenges of the large‑model era, describes the Colossal‑AI architecture—including N‑dimensional parallelism, heterogeneous storage, and zero‑code experience—shows benchmark results and real‑world use cases, and answers audience questions about its integration with PyTorch and advanced parallel strategies.

AI Infrastructure · Colossal-AI · Heterogeneous Storage

0 likes · 11 min read

AI Large Model Application Practice

Dec 30, 2024 · Artificial Intelligence

Implementing LLM Routing and Parallel Agent Workflows with PydanticAI

This tutorial walks through building semantic routing and parallel execution patterns for LLM agents using the lightweight PydanticAI framework, providing step‑by‑step code, example configurations, and practical observations to help developers create flexible AI‑driven workflows.

LLM · PydanticAI · Python

0 likes · 11 min read

Big Data Technology & Architecture

Nov 26, 2024 · Big Data

Understanding Full GC, Data Skew, and Parallelism in Flink Tasks

This article explains how to monitor and interpret Full GC in Flink TaskManagers, detect and address data skew through proper data distribution and parallelism settings, and recommends aligning consumer parallelism with Kafka partitions, while also providing practical tips for using tools like Prometheus and Arthas.

Data Skew · Flink · TaskManager

0 likes · 6 min read

FunTester

Nov 25, 2024 · Fundamentals

Understanding Concurrency and Parallelism in Java Multithreading

This article introduces the basics of Java multithreading concurrency, explains the difference between concurrency and parallelism with a supermarket analogy, and details thread pool creation, usage, and customization through analysis of ThreadPoolExecutor source code.

Java · ThreadPoolExecutor · multithreading

0 likes · 9 min read

Kuaishou Large Model

Nov 22, 2024 · Artificial Intelligence

Boost LLM Training on Massive Clusters with DP/TP Overlap and Context Parallelism

This article details a comprehensive set of techniques—including data‑ and tensor‑parallel overlap, context‑parallelism, activation rematerialization, and a performance‑driven cost model—that dramatically improve large‑language‑model training efficiency on ultra‑large GPU clusters while preserving model quality.

Performance Modeling · activation recomputation · distributed training

0 likes · 28 min read

Top Architect

Oct 17, 2024 · Backend Development

Understanding ForkJoinPool and the Fork/Join Framework in Java

This article explains the limitations of ThreadPoolExecutor, introduces the Fork/Join model and ForkJoinPool, demonstrates how to implement divide‑and‑conquer tasks with RecursiveTask, analyzes the pool’s design, task submission methods, work‑stealing mechanism, common pool pitfalls, and presents performance evaluation results.

DivideAndConquer · ForkJoinPool · Java

0 likes · 26 min read

MaGe Linux Operations

Sep 28, 2024 · Backend Development

Master Go Concurrency: Goroutines, Channels, Locks, Timers and Synchronization

This comprehensive guide explains the fundamentals of concurrent programming in Go, covering the differences between parallelism and concurrency, process and thread concepts, and detailed usage of goroutines, channels, select statements, timers, mutexes, read‑write locks, wait groups, once, sync.Map, and atomic operations with practical code examples and diagrams.

Channel · Goroutine · atomic

0 likes · 42 min read

Test Development Learning Exchange

Sep 22, 2024 · Fundamentals

Understanding Concurrency, Parallelism, Synchronization, Asynchronous, Blocking, and Non‑blocking in Python with Code Examples

This article explains the key concepts of concurrency, parallelism, synchronization, asynchronous execution, blocking, and non‑blocking in Python, providing clear explanations and practical code samples for each concept, including API automation examples for HTTP requests.

Blocking · Non-blocking · Python

0 likes · 14 min read

Huawei Cloud Developer Alliance

Sep 18, 2024 · Artificial Intelligence

How Distributed Training Powers Massive Language Models: Concepts, Strategies, and Code

This article explains why single‑machine resources are insufficient for training ever‑larger language models, introduces the fundamentals of distributed training systems, details various parallel strategies such as data, model, pipeline, and hybrid parallelism, and provides practical PyTorch code and memory‑optimization techniques to accelerate large‑scale model training.

GPU · PyTorch · deep learning

0 likes · 29 min read

Architecture and Beyond

Sep 7, 2024 · Backend Development

Six Proven Backend Techniques to Supercharge System Performance

This comprehensive guide walks backend architects through six core optimization methods—caching, batch processing, asynchronous handling, data compression, parallelization, and eliminating unnecessary requests—detailing their problem domains, implementation strategies, real‑world scenarios, benefits, and trade‑offs.

Asynchronous · Batch Processing · Caching

0 likes · 48 min read

Python Programming Learning Circle

Sep 3, 2024 · Fundamentals

Simplifying Python Parallelism with map and ThreadPool

This article explains why traditional Python multithreading tutorials are often overly complex, introduces the concise map‑based approach using multiprocessing and multiprocessing.dummy ThreadPool, demonstrates performance gains with real‑world examples, and provides ready‑to‑run code snippets for efficient parallel execution.

Multiprocessing · map · parallelism

0 likes · 10 min read
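
The map-based approach this summary mentions can be sketched in a few lines of Python. This is an illustration rather than the article's own code: `fetch_length` and the sample input are hypothetical stand-ins for an I/O-bound call such as an HTTP request.

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed pool with the Pool API


def fetch_length(word):
    # Hypothetical stand-in for an I/O-bound operation (e.g. downloading a URL).
    return len(word)


words = ["map", "thread", "pool"]

# Serial version using the built-in map.
serial = list(map(fetch_length, words))

# Same call shape, spread over 4 worker threads; results keep the input order.
with ThreadPool(4) as pool:
    threaded = pool.map(fetch_length, words)

print(serial, threaded)
```

Swapping the import for `from multiprocessing import Pool` gives the process-based variant with the same `map` call, which is the usual choice for CPU-bound work.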

FunTester

Aug 15, 2024 · Backend Development

9 Proven Techniques to Supercharge Service Performance

This article outlines nine practical methods—caching, parallelization, batch processing, data compression, lock‑free design, sharding, request avoidance, pooling, and asynchronous handling—demonstrating how each can be applied to backend services to dramatically reduce latency and improve throughput.

Asynchronous · Batch Processing · Caching

0 likes · 25 min read

360 Smart Cloud

Jul 17, 2024 · Artificial Intelligence

Parallelism and Memory‑Optimization Techniques for Distributed Large‑Scale Transformer Training

This article reviews the principles and practical implementations of data, pipeline, tensor, sequence, and context parallelism together with memory‑saving strategies such as recomputation and ZeRO, and demonstrates how the QLM framework leverages these techniques to accelerate large‑model training and fine‑tuning on multi‑GPU clusters.

GPU · Megatron-LM · Memory optimization

0 likes · 18 min read

Baobao Algorithm Notes

Jul 11, 2024 · Artificial Intelligence

Why Separate Prefill and Decode? A Deep Dive into DistServe’s Split Inference Architecture

This article explores the two‑stage LLM inference pipeline, introduces the TTFT and TPOT metrics, explains the motivation for prefill‑decode separation, presents experimental comparisons between split and merged architectures, and details optimization techniques and parallel‑strategy modeling for DistServe.

DistServe · Goodput · LLM Inference

0 likes · 28 min read

Architect

Jun 26, 2024 · Backend Development

Understanding the Fork/Join Framework and ForkJoinPool in Java

This article explains the limitations of ThreadPoolExecutor, introduces the Fork/Join model and ForkJoinPool, demonstrates how to implement divide‑and‑conquer tasks with RecursiveTask, provides performance benchmarks, and discusses design details, task submission methods, work‑stealing, and cautions about using the common pool.

DivideAndConquer · ForkJoinPool · Java

0 likes · 23 min read

Open Source Linux

Jun 1, 2024 · Fundamentals

Mastering Concurrency and Parallelism in Java: From Basics to Advanced APIs

This article explains the concepts of concurrency, parallelism, and serial execution, describes common multi‑core scheduling algorithms, and demonstrates Java's concurrent programming tools—including Future, Fork/Join, Stream API, and CompletableFuture—through clear code examples and practical guidelines.

CompletableFuture · Future · Java

0 likes · 20 min read

Baidu Geek Talk

May 15, 2024 · Artificial Intelligence

Accelerating Large Model Training and Inference with Baidu Baige AIAK‑LLM: Challenges, Techniques, and Optimizations

The talk outlines how Baidu’s Baige AIAK‑LLM suite tackles the exploding compute demands of trillion‑parameter models by raising Model FLOPs Utilization (MFU) through advanced parallelism, memory‑saving recomputation, ZeRO‑Offload, adaptive scheduling, and cross‑chip orchestration, delivering 30‑60% training and inference speedups and a unified cloud product.

AI Infrastructure · Baidu · Inference Optimization

0 likes · 25 min read

Baidu Intelligent Cloud Tech Hub

May 15, 2024 · Artificial Intelligence

How Baidu’s AIAK‑LLM Supercharges Large‑Model Training and Inference

The article explains the scaling challenges of ever‑larger LLMs, introduces the MFU performance metric, surveys industry parallelism and memory‑saving techniques, and details Baidu’s AIAK‑LLM suite—including resource, component and acceleration layers—as well as concrete training and inference optimizations that raise MFU by 30‑60% and cut deployment costs.

AI Infrastructure · MFU · Memory optimization

0 likes · 25 min read

MaGe Linux Operations

May 2, 2024 · Fundamentals

Unlock Python’s Power: Master Multiprocessing for Faster, Scalable Code

This comprehensive guide explains Python’s multiprocessing module, covering process creation, inter‑process communication, pools, synchronization primitives, error handling, and real‑world examples such as web crawlers, data analysis, and game servers, helping developers harness multiple CPU cores to boost performance and avoid GIL limitations.

Code examples · Multiprocessing · Python

0 likes · 32 min read

DataFunSummit

Apr 14, 2024 · Artificial Intelligence

TensorRT-LLM: NVIDIA’s Scalable LLM Inference Framework – Overview, Features, Workflow, Performance, and Future Directions

This article presents a comprehensive overview of NVIDIA’s TensorRT-LLM, detailing its product positioning as a scalable LLM inference solution, key features such as model support, low-precision and quantization techniques, parallelism strategies, the end-to-end usage workflow, performance highlights, future roadmap, and answers to common technical questions.

LLM Inference · NVIDIA · Quantization

0 likes · 13 min read

Architect's Guide

Mar 22, 2024 · Backend Development

Understanding ForkJoinPool: Principles, Implementation, and Performance Evaluation in Java

This article explains the Fork/Join model and Java's ForkJoinPool, covering divide‑and‑conquer theory, custom RecursiveTask examples, pool construction options, task submission methods, work‑stealing mechanics, commonPool pitfalls, and performance testing results to help developers decide when to use it.

DivideAndConquer · ForkJoinPool · Java

0 likes · 22 min read

Go Development Architecture Practice

Mar 21, 2024 · Backend Development

How to Process One Billion Rows in Go: 9 Optimized Solutions Under 4 Seconds

This article walks through nine Go‑based implementations for the 1‑Billion‑Row Challenge, starting from a straightforward scanner approach and progressively applying map pointer values, custom parsing, integer arithmetic, buffer tweaks, custom hash tables, and parallelism to shrink processing time from 1 minute 45 seconds to under 4 seconds.

1BRC · Go · benchmark

0 likes · 22 min read

Big Data Technology & Architecture

Mar 20, 2024 · Big Data

Flink 1.19 New Features: SQL Optimizations, Runtime Enhancements, and Checkpointing Improvements

The article reviews Flink 1.19’s new features, highlighting SQL capability enhancements such as custom source parallelism, TTL hints, and MiniBatch support for regular joins, as well as runtime dynamic parallelism for batch jobs and flexible checkpointing intervals for different data sources.

Big Data · Flink · SQL

0 likes · 6 min read

Bilibili Tech

Mar 15, 2024 · Artificial Intelligence

Hardware Resource Estimation and Bottleneck Analysis for Large Language Models (LLMs)

The article analyzes the compute, memory, and communication resources required to train and run large language models, quantifies bottlenecks such as the massive FLOP demand, terabyte‑scale GPU memory, and high‑bandwidth interconnect needs, and evaluates parallelism strategies and bandwidth estimates to guide hardware and software design for scaling LLMs.

AI Infrastructure · Hardware · LLM

0 likes · 53 min read

Ops Development & AI Practice

Feb 25, 2024 · Backend Development

Understanding Go's CSP-Based Concurrency: Goroutines and Channels Explained

This article explains Go's CSP-inspired concurrency model, detailing how lightweight Goroutines and typed Channels work together to enable efficient, non‑preemptive parallelism and common patterns like producer‑consumer, pipelines, and worker pools.

CSP · Channel · Go

0 likes · 4 min read

NewBeeNLP

Feb 8, 2024 · Artificial Intelligence

How Speculative Decoding Supercharges Large Language Model Inference

This survey examines speculative decoding—a draft‑then‑verify technique that parallelizes token generation to cut LLM inference latency, outlines its core components, compares independent and self‑drafting methods, discusses verification strategies, and highlights open research challenges.

LLM Inference · Performance Optimization · artificial-intelligence

0 likes · 15 min read

DataFunTalk

Jan 31, 2024 · Artificial Intelligence

Introduction to NVIDIA TensorRT-LLM Inference Framework

TensorRT-LLM is NVIDIA's scalable inference framework for large language models that combines TensorRT compilation, fast kernels, multi‑GPU parallelism, low‑precision quantization, and a PyTorch‑like API to deliver high‑performance LLM serving with extensive customization and future‑focused enhancements.

GPU Acceleration · LLM Inference · NVIDIA

0 likes · 12 min read

JD Retail Technology

Dec 19, 2023 · Fundamentals

Overview of CPU Architecture, Performance Trends, and Their Impact on Software Development

This article reviews recent decades of CPU performance improvements and semiconductor process advances, explains current CPU architectures, instruction set evolution, and how these trends influence software development practices, including parallelism, SIMD, multithreading, and power‑efficiency considerations.

CPU architecture · Instruction Set · microarchitecture

0 likes · 42 min read

DataFunTalk

Dec 6, 2023 · Artificial Intelligence

Distributed Training Techniques and Quantitative Analysis for Large Language Models (GPT‑175B)

This article presents a comprehensive overview of state‑of‑the‑art distributed training methods for large language models, using GPT‑175B as a case study to analyze memory, communication, and compute overheads, and to recommend practical optimization strategies such as tensor, pipeline, and sequence parallelism, ZeRO‑1 optimizer, and selective activation checkpointing.

GPU memory optimization · LLM · Megatron

0 likes · 22 min read

Python Programming Learning Circle

Oct 7, 2023 · Fundamentals

Understanding Python Threads, Processes, GIL, and Multiprocessing

This article explains the fundamental differences between threads and processes, the role of Python's Global Interpreter Lock (GIL), and provides a comprehensive guide to using the multiprocessing module, its components, synchronization primitives, and the concurrent.futures API for parallel execution in Python.

GIL · Multiprocessing · concurrency

0 likes · 36 min read

Baobao Algorithm Notes

Sep 12, 2023 · Artificial Intelligence

Why RTX 4090 Beats H100 for LLM Inference but Fails at Training

The article analyses the performance, memory, bandwidth and cost of NVIDIA H100, A100 and RTX 4090 GPUs, explains why the 4090 cannot handle large‑model training due to communication and memory limits, and shows how its high compute‑to‑price ratio makes it attractive for inference, backed by detailed parallelism calculations and cost‑per‑token estimates.

GPU · LLM · cost

0 likes · 46 min read

Baidu Intelligent Cloud Tech Hub

Jul 24, 2023 · Artificial Intelligence

How PaddlePaddle Powers Large‑Model Distributed Training: Techniques & Optimizations

This article explains the challenges of training massive AI models and details PaddlePaddle's 4D hybrid parallelism, MoE acceleration, long‑sequence strategies, end‑to‑end performance optimizations, and practical code examples for building and scaling large models efficiently.

AI · PaddlePaddle · distributed training

0 likes · 12 min read

Cognitive Technology Team

May 20, 2023 · Backend Development

Correct Use of CompletableFuture with Streams for Parallel Execution in Java

The article explains how to properly combine Java's CompletableFuture with Stream APIs to achieve true parallel execution, highlights common pitfalls that prevent concurrency, and provides the correct pattern of splitting streams and applying terminal operations for effective asynchronous processing.

CompletableFuture · Java · Stream

0 likes · 3 min read

Big Data Technology & Architecture

Apr 23, 2023 · Big Data

Spark and Flink Optimization Guide: Parallelism, GC Tuning, Memory Settings, and Production Configurations

This article provides a comprehensive guide on optimizing Spark and Flink workloads, covering parallelism settings, garbage‑collection tuning, out‑of‑memory mitigation, and full production‑grade configuration examples for both frameworks.

Big Data · Flink · GC optimization

0 likes · 7 min read

Selected Java Interview Questions

Apr 11, 2023 · Fundamentals

Understanding the Differences Between Processes and Threads, Concurrency, and Shared Resources

This article explains the concepts of processes and threads, their fundamental differences, how they relate to concurrency and parallelism, and details which resources are private to each thread versus shared across a process, using diagrams and real‑world factory analogies to aid understanding.

operating system · parallelism · process

0 likes · 13 min read

MaGe Linux Operations

Apr 4, 2023 · Backend Development

Boost Python Scripts with map() Parallelism: From Threads to ThreadPools

Traditional Python multithreading tutorials often overcomplicate simple tasks. By pairing the built‑in map() function with a pool from multiprocessing (processes, suited to CPU‑bound work) or multiprocessing.dummy (threads, suited to I/O‑bound work), developers can shrink scripts from dozens of lines to just a few while achieving significant speedups.

ThreadPool · map · parallelism

0 likes · 13 min read

DataFunSummit

Apr 2, 2023 · Artificial Intelligence

Efficient Training of Large Models with the Open‑Source Distributed Framework Easy Parallel Library (EPL)

This article introduces the challenges of scaling deep‑learning model training, explains the design and components of the open‑source Easy Parallel Library (EPL) that unifies data, pipeline, and operator‑split parallelism, and demonstrates its best‑practice results on large‑scale classification, BERT‑large, and massive multimodal models.

EPL · Large‑Scale Training · ZeRO

0 likes · 15 min read

Baidu Intelligent Cloud Tech Hub

Feb 23, 2023 · Artificial Intelligence

How Baidu’s Cloud Infrastructure Tackles the Challenges of Training Massive AI Models

This article explains how Baidu's intelligent cloud overcomes the compute and storage walls of large‑scale model training by combining hardware design, network topology, and software optimizations such as pipeline, tensor, and expert parallelism, cost‑model‑driven placement, and future‑proof AI infrastructure evolution.

AI Infrastructure · Baidu Cloud · Cost Model

0 likes · 28 min read

FunTester

Feb 21, 2023 · Backend Development

Mastering Java ForkJoinPool: A Hands‑On Guide to Parallel Task Execution

The article introduces Java's ForkJoinPool for dividing large, compute‑intensive tasks into smaller subtasks, explains its suitability for performance testing scenarios such as high‑throughput QPS/RT data collection, and provides a complete Groovy‑based demo that defines a RecursiveTask, implements the compute method, and runs a sum calculation using a thread pool.

ForkJoinPool · Java · PerformanceTesting

0 likes · 6 min read

Architects' Tech Alliance

Jan 30, 2023 · Operations

Advanced Software Performance Optimization Techniques: From Resource Exhaustion to Parallelism

This article presents a comprehensive guide to software performance optimization, covering low‑level resource exhaustion, horizontal scaling, sharding, lock‑free techniques, and system‑wide strategies, while offering practical examples and references for developers seeking to improve efficiency and scalability.

Resource Management · parallelism · scalability

0 likes · 12 min read

DataFunSummit

Jan 5, 2023 · Artificial Intelligence

GPU Acceleration Techniques for Large AI Models: Parallelism, Fusion, and Simplification

These notes explain how GPUs address the massive data, serial dependencies, and high computational complexity of modern AI by employing three acceleration strategies—parallelism, operator fusion, and simplification—illustrated with Megatron-LM, MoE models, and practical compression techniques such as quantization, distillation, and pruning.

AI · GPU · Megatron

0 likes · 16 min read

Laravel Tech Community

Jan 4, 2023 · Fundamentals

Understanding Processes, Threads, Concurrency, and Process Pools

This article explains the concepts of processes and threads, their differences and interactions, the states of a process, the distinctions between serial, concurrent, and parallel execution, and the purpose and operation of process pools in modern computing environments.

Process Pool · parallelism · process

0 likes · 12 min read

DataFunTalk

Jan 4, 2023 · Artificial Intelligence

GPU Acceleration Techniques for Large AI Models: Parallelism, Fusion, and Simplification

This article explains how GPUs address the massive data, serial dependencies, and high computational complexity of modern AI by employing three acceleration strategies—parallelism, operator fusion, and simplification—detailing methods such as model, pipeline, and tensor parallelism, Megatron framework, MoE models, and various model compression techniques.

AI · GPU · Megatron

0 likes · 17 min read

Laravel Tech Community

Nov 16, 2022 · Databases

DuckDB New Release Highlights and Feature Changes

The article introduces DuckDB, a high‑performance embedded analytical database, outlines its new release’s storage, performance, and memory improvements, describes its C/C++ integration and build process, and lists key feature changes such as parallel execution, novel compression methods, and enhanced SQL capabilities.

DuckDB · Embedded Database · SQL

0 likes · 3 min read

Programmer DD

Oct 8, 2022 · Fundamentals

Eight Timeless Computer Architecture Principles Every Designer Should Know

This article outlines eight enduring ideas—from designing for Moore's Law and using abstraction to speeding up common cases, leveraging parallelism, pipelining, prediction, memory hierarchy, and redundancy—that have shaped computer architecture over the past six decades.

Caching · Computer Architecture · Moore's Law

0 likes · 11 min read

Architect's Tech Stack

Aug 3, 2022 · Backend Development

Understanding Java Stream API: filter, map, flatMap, and Parallel Operations

This article introduces Java's Stream API, explaining how filter, map, flatMap, and other intermediate operations work, provides practical code examples for each, and demonstrates stream creation, conversion, and parallel processing to efficiently handle collections and large data sets.

Backend Development · Functional Programming · Java

0 likes · 9 min read

Alibaba Cloud Big Data AI Platform

Jul 12, 2022 · Artificial Intelligence

How Whale Enables Efficient Giant Model Training on Heterogeneous GPUs

The article introduces Whale, an open‑source distributed training framework that unifies multiple parallelism strategies, uses hardware‑aware load balancing to accelerate giant models like BERT‑Large and the trillion‑parameter M6 on heterogeneous GPU clusters, and details its architecture, planning, and real‑world performance gains.

deep learning · hardware-aware scheduling · heterogeneous GPUs

0 likes · 11 min read

Baidu Geek Talk

Jul 6, 2022 · Artificial Intelligence

Why Training Massive AI Models Demands New Cluster Architectures and Parallelism Strategies

The article examines the industry trend toward ever‑larger AI models, compares their parameter scale to the human brain, outlines the computational and memory challenges of training such models, and details advanced parallelism techniques and Baidu's high‑performance cluster solutions that enable efficient, stable large‑scale model training.

AI Infrastructure · Baidu · Cluster Computing

0 likes · 17 min read

MaGe Linux Operations

Jun 10, 2022 · Fundamentals

Why Traditional Python Threading Tutorials Fail and How a Simple map Boosts Speed

This article critiques heavyweight Python threading tutorials, explains why the built‑in map function combined with multiprocessing or multiprocessing.dummy offers a concise, efficient way to parallelize I/O‑ and CPU‑bound tasks, and demonstrates dramatic speed‑ups with real code examples.

Multiprocessing · Python · map function

0 likes · 13 min read

Tencent Cloud Developer

Jun 2, 2022 · Fundamentals

A Detailed Explanation of Asynchronous Programming

The article explains asynchronous programming by contrasting concurrency, parallelism, and synchronization, illustrates how splitting serial work into independent async tasks can improve performance but introduces resource, locking, and state‑tracking challenges, and offers strategies such as careful task limits, locking, queues, and result monitoring.

Programming Concepts · asynchronous programming · concurrency

0 likes · 23 min read

Java Backend Technology

May 17, 2022 · Backend Development

Cutting a 5‑Second Java Service to <1s with Compression, Parallelism & Caching

This article details how a Java Spring Boot microservice's response time was reduced from 5‑6 seconds to under one second by applying gzip compression, parallel data fetching, short‑lived caching, MySQL index tuning, and JVM G1 garbage‑collector adjustments.

JVM · Java · Microservices

0 likes · 16 min read

Code Ape Tech Column

Apr 25, 2022 · Backend Development

Deep Dive into Java ForkJoinPool: Design, Implementation, and Usage

This article explains the divide‑and‑conquer principle, the internal design of Java's ForkJoinPool, its core classes (ForkJoinTask, ForkJoinWorkerThread, WorkQueue), key methods for task submission, work stealing, thread management, and provides practical code examples to illustrate how to implement and use fork/join parallelism effectively.

ForkJoinPool · Java · WorkStealing

0 likes · 48 min read

Architecture Digest

Jan 24, 2022 · Fundamentals

Understanding Python Threads, Processes, GIL, and the multiprocessing & concurrent.futures Modules

This article explains the fundamental differences between threads and processes, the role of Python's Global Interpreter Lock, and provides a comprehensive guide to using the multiprocessing and concurrent.futures modules—including their main classes, synchronization primitives, and practical code examples—for effective concurrent programming in Python.

GIL · Multiprocessing · Python

0 likes · 40 min read

ByteDance Terminal Technology

Dec 28, 2021 · Mobile Development

Analyzing Gradle’s Scheduling Mechanism to Optimize Android Component Publishing

This article investigates why large Android projects experience extremely slow AAR publishing, reveals that memory is not the main bottleneck, examines Gradle’s core scheduling, the Worker API, lock contention, and measurement inaccuracies, and proposes disabling the Worker API to achieve up to a fifteen‑fold build speedup.

Android · Build Performance · Task scheduling

0 likes · 20 min read

Code Ape Tech Column

Dec 8, 2021 · Backend Development

Understanding Java 8 Stream API, Parallel Streams, and ForkJoinPool

This article explains Java 8 Stream API fundamentals, its composition, pipelining and internal iteration, details parallel stream execution using ForkJoinPool, discusses performance considerations, and provides practical code examples for creating and managing streams in backend development.

ForkJoinPool · Java · Stream API

0 likes · 19 min read

Architects' Tech Alliance

Jul 19, 2021 · Fundamentals

Understanding Processes and Threads: Definitions, Differences, Advantages, and Practical Usage

This article explains the fundamental concepts of processes and threads in operating systems, compares their characteristics, outlines their respective advantages and disadvantages, and provides practical guidelines for choosing between multi‑process and multi‑thread designs in real‑world applications.

concurrency · operating system · parallelism

0 likes · 20 min read

Fulu Network R&D Team

Jul 6, 2021 · Operations

Understanding Throughput, Concurrency, and Lock Contention in System Design

Throughput is the rate at which an application completes tasks, which is distinct from concurrency. It can be improved by reducing task latency, increasing parallelism, and optimizing lock usage: finer lock granularity, lower locking cost, and techniques such as buffering, merging, and batch processing mitigate contention and improve scalability.

Locks · Throughput · parallelism

0 likes · 11 min read

Big Data Technology & Architecture

Apr 4, 2021 · Big Data

Flink Performance Tuning Guide: Memory Configuration, Parallelism, Checkpoint Optimization, and Common Issues

This guide details comprehensive Flink performance tuning techniques, covering memory configuration, GC settings, parallelism adjustments, process parameters, partitioning strategies, Netty network tuning, checkpoint optimization, and common issues such as data skew and resource bottlenecks.

Checkpoint · Flink · Memory Management

0 likes · 18 min read

Python Programming Learning Circle

Apr 2, 2021 · Fundamentals

Effective Python Parallelism with Thread Pools and the map() Function

This article critiques traditional Python threading tutorials and demonstrates how to replace verbose thread‑pool code with concise map‑based parallelism using multiprocessing and multiprocessing.dummy, providing practical examples, performance measurements, and guidelines for choosing pool sizes for I/O‑ and CPU‑bound tasks.

Multiprocessing · concurrency · map

0 likes · 11 min read

Aikesheng Open Source Community

Mar 11, 2021 · Databases

Optimizing XtraBackup Parameters for Faster MySQL Physical Backups

This article explains XtraBackup's backup workflow, introduces the --parallel and --compress-threads options, and presents performance test results showing how these parameters can significantly reduce backup time for large MySQL databases.

Backup Optimization · mysql · parallelism

0 likes · 6 min read

Python Crawling & Data Mining

Mar 7, 2021 · Fundamentals

Unlocking System Performance: How Amdahl’s Law and Parallelism Shape Modern Computing

This article explains how computer systems combine hardware and system software, describes the memory hierarchy, OS abstractions, Amdahl's law, and the three levels of parallelism—thread‑level, instruction‑level, and SIMD—showing why understanding these concepts is essential for writing fast, reliable programs.

Amdahl's Law · Computer Architecture · Memory Hierarchy

0 likes · 16 min read

Architects Research Society

Sep 2, 2020 · Databases

Scaling PostgreSQL for Multi‑Terabyte Databases: Indexes, Partitioning, Tablespaces, Parallelism, and Replication

This article explains how to extract maximum performance and scalability from PostgreSQL for multi‑terabyte workloads by leveraging specialized indexes, declarative partitioning, tablespaces, parallel query execution, read‑only replica load‑balancing, and foreign‑table sharding techniques.

Indexes · PostgreSQL · Tablespaces

0 likes · 10 min read

MaGe Linux Operations

Aug 17, 2020 · Fundamentals

Boost Python Performance: Simple Parallelism with map and ThreadPool

This article explains why traditional Python threading tutorials are often over‑engineered, introduces the concise map‑based parallelism using multiprocessing and multiprocessing.dummy, and demonstrates how a few lines of code can dramatically speed up I/O‑bound and CPU‑bound tasks.

Multiprocessing · ThreadPool · concurrency

0 likes · 11 min read

Architect

May 21, 2020 · Big Data

Parallel Execution of Multiple Spark Jobs to Optimize Resource Utilization and Reduce Parquet File Count

This article examines how to run several Spark jobs concurrently on a shared SparkContext, balancing full CPU‑vcore utilization with the need to generate fewer Parquet files, and presents practical experiments, scheduling strategies, and performance results.

Big Data · Job Scheduling · Parquet

0 likes · 12 min read

Java Backend Technology

Feb 22, 2020 · Backend Development

How Fast Is Java Stream API? Real-World Performance Benchmarks and Insights

This article presents a thorough performance comparison of Java Stream API versus traditional for-loop iteration across simple, object, and complex reduction tasks, revealing when serial or parallel streams excel and offering practical recommendations for developers.

JVM · Java · Stream API

0 likes · 8 min read

macrozheng

Feb 13, 2020 · Backend Development

How Fast Is Java Stream API? In‑Depth Performance Benchmarks Revealed

This article presents a comprehensive benchmark of Java's Stream API, comparing its serial and parallel performance against traditional loops across primitive, object, and reduction operations, and offers practical recommendations based on multi‑core versus single‑core results.

Java · Stream API · benchmark

0 likes · 9 min read

Programmer DD

Nov 15, 2019 · Fundamentals

Why Concurrency Isn’t the Same as Parallelism: A Simple Analogy

This article explains the subtle difference between concurrency and parallelism using a gopher‑and‑cart analogy, shows how task decomposition creates concurrent pipelines, and maps the model to scalable web‑service architecture, referencing Rob Pike’s talk “Concurrency is not Parallelism”.

Go · Programming Fundamentals · Web services

0 likes · 7 min read
