Tag

Parallelism


Architect's Tech Stack
May 15, 2025 · Backend Development

Understanding Java Stream API: filter, map, flatMap, and Parallel Operations

This article explains how Java's Stream API enables efficient data processing through pipeline operations such as filter, map, and flatMap. It also covers stream creation methods, conversion to collections, and parallel execution, with code examples and practical usage guidelines.

Functional Programming · Java · Parallelism
11 min read
FunTester
Apr 18, 2025 · Backend Development

Using CompletableFuture for Parallel REST Calls in Java

The article explains why serial REST calls cause performance bottlenecks, illustrates the benefits of concurrent requests, and demonstrates how Java 8's CompletableFuture can be used to implement parallel REST calls with robust exception handling, improving throughput and resource utilization.

CompletableFuture · Concurrency · Java
10 min read
Cognitive Technology Team
Apr 12, 2025 · Backend Development

Using CompletableFuture with Streams for Parallel Execution in Java

The article explains how to correctly combine Java's CompletableFuture with Stream API to achieve true asynchronous parallelism, highlights common pitfalls that lead to sequential execution, and provides the proper pattern of creating a CompletableFuture stream followed by a terminal operation.

CompletableFuture · Concurrency · Java
3 min read
Top Architect
Apr 9, 2025 · Backend Development

Understanding ForkJoinPool: Principles, Usage, and Performance Evaluation in Java

This article explains the Fork/Join model and Java's ForkJoinPool, covering divide‑and‑conquer theory, task types, pool construction, core methods, performance testing, and practical recommendations such as avoiding the commonPool for blocking tasks.

Concurrency · ForkJoinPool · Java
26 min read
Architecture Development Notes
Mar 16, 2025 · Backend Development

Choosing the Right Concurrency Model: Go vs Python vs Rust

This article compares Go, Python, and Rust concurrency implementations—covering CSP‑based goroutines, GIL constraints, and ownership‑driven thread safety—to help developers select the most suitable model for high‑throughput, CPU‑bound, or safety‑critical applications.

Async · Concurrency · Go
9 min read
DataFunSummit
Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes a presentation by Wang Xin, Zhihu's machine‑learning platform lead, on the ZhiLight large‑model inference framework. It covers model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · Inference · LLM
13 min read
Code Mala Tang
Feb 15, 2025 · Fundamentals

Unlock Full CPU Power in Python: A Hands‑On Guide to Multiprocessing

This article explains why Python’s Global Interpreter Lock limits CPU core usage, introduces the multiprocessing module for parallel execution of CPU‑intensive tasks, and provides step‑by‑step code examples, key concepts, synchronization tools, a real‑world image‑processing case, and best practices to dramatically speed up your programs.

CPU-bound · Concurrency · Multiprocessing
9 min read
Python Programming Learning Circle
Jan 15, 2025 · Fundamentals

Communicating Sequential Processes (CSP): Concepts, Implementations, and Python Libraries

This article explains the CSP concurrency model, compares it with the Actor model, discusses its advantages and limitations, and reviews Go's native support as well as several Python libraries and experimental projects that aim to bring CSP-style parallelism to Python.

Async · CSP · Concurrency
11 min read
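
As a rough illustration of the CSP idea summarized in this entry, the sketch below wires two worker threads together with bounded queues that stand in for channels. It is not taken from the article: the names `producer`, `squarer`, `run_pipeline`, and `DONE` are hypothetical, and `queue.Queue(maxsize=1)` only approximates a channel (it buffers one item rather than forcing a strict rendezvous).

```python
import queue
import threading

DONE = object()  # hypothetical sentinel that marks the end of a stream


def producer(out_ch, items):
    """Send every item down the channel, then signal completion."""
    for item in items:
        out_ch.put(item)  # blocks while the bounded channel is full
    out_ch.put(DONE)


def squarer(in_ch, out_ch):
    """Receive numbers, square them, and pass them on until DONE arrives."""
    while True:
        item = in_ch.get()  # blocks until a message is available
        if item is DONE:
            out_ch.put(DONE)  # propagate the end-of-stream signal
            return
        out_ch.put(item * item)


def run_pipeline(items):
    """Connect producer -> squarer -> caller using queues as channels."""
    numbers = queue.Queue(maxsize=1)
    squares = queue.Queue(maxsize=1)
    workers = [
        threading.Thread(target=producer, args=(numbers, items)),
        threading.Thread(target=squarer, args=(numbers, squares)),
    ]
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = squares.get()
        if item is DONE:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

The stages share no mutable state and communicate only through messages, which is the property CSP emphasizes. In CPython these threads run concurrently rather than in parallel for CPU-bound work.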
DataFunSummit
Dec 30, 2024 · Artificial Intelligence

Colossal-AI: A Scalable Framework for Distributed Training of Large Models

This presentation introduces the challenges of the large‑model era, describes the Colossal‑AI architecture—including N‑dimensional parallelism, heterogeneous storage, and zero‑code experience—shows benchmark results and real‑world use cases, and answers audience questions about its integration with PyTorch and advanced parallel strategies.

AI infrastructure · Colossal-AI · Large Models
11 min read
Architecture Development Notes
Dec 11, 2024 · Fundamentals

Master Rust Multithreading: Real-World Examples and Interactive Quizzes

Explore Rust's powerful multithreading model, covering core concepts like ownership, channels, Mutex and Arc, with practical examples for web servers, game development, data processing, and scientific simulations, plus interactive quizzes to reinforce your understanding.

Concurrency · Multithreading · Parallelism
9 min read
FunTester
Nov 25, 2024 · Fundamentals

Understanding Concurrency and Parallelism in Java Multithreading

This article introduces the basics of Java multithreading concurrency, explains the difference between concurrency and parallelism with a supermarket analogy, and details thread pool creation, usage, and customization through analysis of ThreadPoolExecutor source code.

Concurrency · Java · Multithreading
9 min read
Kuaishou Large Model
Nov 22, 2024 · Artificial Intelligence

Boost LLM Training on Massive Clusters with DP/TP Overlap and Context Parallelism

This article details a comprehensive set of techniques—including data‑ and tensor‑parallel overlap, context‑parallelism, activation rematerialization, and a performance‑driven cost model—that dramatically improve large‑language‑model training efficiency on ultra‑large GPU clusters while preserving model quality.

Parallelism · activation recomputation · distributed training
28 min read
Architecture Development Notes
Nov 22, 2024 · Fundamentals

Master Rust Thread Pools: Build a Custom Concurrent Executor

This guide explains the fundamentals of thread pools, how task scheduling works, and provides a step‑by‑step tutorial for building a custom, efficient thread‑pool implementation in Rust, complete with code examples and an exercise to test concurrent task execution.

Concurrency · Custom Executor · Parallelism
7 min read
Python Programming Learning Circle
Nov 20, 2024 · Fundamentals

Using Python multiprocessing to Accelerate CPU‑bound Tasks

This article explains how to use Python's multiprocessing module (including Process, Pool, map, apply_async, and shared data primitives) to parallelize CPU‑intensive work and improve performance, and demonstrates a real‑world example of parallel image downloading.

Concurrency · Multiprocessing · Parallelism
8 min read
Top Architect
Oct 17, 2024 · Backend Development

Understanding ForkJoinPool and the Fork/Join Framework in Java

This article explains the limitations of ThreadPoolExecutor, introduces the Fork/Join model and ForkJoinPool, demonstrates how to implement divide‑and‑conquer tasks with RecursiveTask, analyzes the pool’s design, task submission methods, work‑stealing mechanism, common pool pitfalls, and presents performance evaluation results.

Concurrency · DivideAndConquer · ForkJoinPool
26 min read
Test Development Learning Exchange
Sep 22, 2024 · Fundamentals

Understanding Concurrency, Parallelism, Synchronization, Asynchronous, Blocking, and Non‑blocking in Python with Code Examples

This article explains the key concepts of concurrency, parallelism, synchronization, asynchronous execution, blocking, and non‑blocking in Python, providing clear explanations and practical code samples for each concept, including API automation examples for HTTP requests.

Concurrency · Parallelism · Synchronization
14 min read
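
To make the asynchronous, non‑blocking idea in this entry concrete, here is a minimal sketch using the standard `asyncio` module. It is an illustration rather than the article's own code: `fake_request` is a hypothetical stand‑in for an HTTP call, and `asyncio.sleep` simulates network latency.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Hypothetical stand-in for a non-blocking HTTP request."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking
    return f"{name} done"


async def run_concurrently():
    # gather() runs the coroutines concurrently on one thread and
    # returns their results in the order the coroutines were passed in.
    return await asyncio.gather(
        fake_request("a", 0.1),
        fake_request("b", 0.1),
        fake_request("c", 0.1),
    )


def timed_run():
    """Run the three requests and report how long the batch took."""
    start = time.perf_counter()
    results = asyncio.run(run_concurrently())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the batch should take roughly as long as the slowest request rather than the sum of all three. Swapping `asyncio.sleep` for a blocking `time.sleep` would serialize the calls.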
Python Programming Learning Circle
Sep 3, 2024 · Fundamentals

Simplifying Python Parallelism with map and ThreadPool

This article explains why traditional Python multithreading tutorials are often overly complex, introduces the concise map‑based approach using multiprocessing and multiprocessing.dummy ThreadPool, demonstrates performance gains with real‑world examples, and provides ready‑to‑run code snippets for efficient parallel execution.

Multiprocessing · Parallelism · Performance
10 min read
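
The map‑based approach this entry describes can be sketched as follows. This is a minimal illustration, not the article's code: `fetch_length` and `lengths_in_parallel` are hypothetical names, and `time.sleep` stands in for an I/O‑bound call such as a web request.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed Pool, same API


def fetch_length(word):
    """Hypothetical stand-in for an I/O-bound task such as an HTTP request."""
    time.sleep(0.05)  # simulate waiting on the network
    return len(word)


def lengths_in_parallel(words, workers=4):
    # map() distributes the items across the worker threads and
    # returns the results in the same order as the input.
    with ThreadPool(workers) as pool:
        return pool.map(fetch_length, words)


if __name__ == "__main__":
    print(lengths_in_parallel(["map", "thread", "pool"]))  # [3, 6, 4]
```

For CPU‑bound work, importing `Pool` from `multiprocessing` instead of `multiprocessing.dummy` gives the same `map` interface backed by processes. In that case the worker function must be importable at module level and the pool should be created under the `if __name__ == "__main__":` guard.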
360 Smart Cloud
Jul 17, 2024 · Artificial Intelligence

Parallelism and Memory‑Optimization Techniques for Distributed Large‑Scale Transformer Training

This article reviews the principles and practical implementations of data, pipeline, tensor, sequence, and context parallelism together with memory‑saving strategies such as recomputation and ZeRO, and demonstrates how the QLM framework leverages these techniques to accelerate large‑model training and fine‑tuning on multi‑GPU clusters.

GPU · Megatron-LM · Memory Optimization
18 min read
Architect
Jun 26, 2024 · Backend Development

Understanding the Fork/Join Framework and ForkJoinPool in Java

This article explains the limitations of ThreadPoolExecutor, introduces the Fork/Join model and ForkJoinPool, demonstrates how to implement divide‑and‑conquer tasks with RecursiveTask, provides performance benchmarks, and discusses design details, task submission methods, work‑stealing, and cautions about using the common pool.

Concurrency · DivideAndConquer · ForkJoinPool
23 min read
Baidu Geek Talk
May 15, 2024 · Artificial Intelligence

Accelerating Large Model Training and Inference with Baidu Baige AIAK‑LLM: Challenges, Techniques, and Optimizations

The talk outlines how Baidu’s Baige AIAK‑LLM suite tackles the exploding compute demands of trillion‑parameter models by boosting Model FLOPS Utilization through advanced parallelism, memory‑saving recompute, zero‑offload, adaptive scheduling, and cross‑chip orchestration, delivering 30‑60% training and inference speedups and a unified cloud product.

AI infrastructure · Baidu · MFU
25 min read