Tagged articles
8 articles
Baidu Geek Talk
Jul 6, 2022 · Artificial Intelligence

Why Training Massive AI Models Demands New Cluster Architectures and Parallelism Strategies

The article examines the industry trend toward ever‑larger AI models, compares their parameter scale to the human brain, outlines the computational and memory challenges of training such models, and details advanced parallelism techniques and Baidu's high‑performance cluster solutions that enable efficient, stable large‑scale model training.

AI Infrastructure · Baidu · Cluster Computing
17 min read
ITPUB
Sep 13, 2021 · Big Data

MapReduce vs MPP: Choosing the Right Engine for Global Data Warehousing

A team of engineers at MBI debates the merits of MapReduce, MPP, and Hive for their KeepS global data warehouse, discussing technical differences, scalability, concurrency, and the feasibility of mixed batch engines while navigating budget and operational constraints.

Cluster Computing · Grid Computing · Hive
20 min read
ITFLY8 Architecture Home
Nov 5, 2016 · Operations

Distributed vs Cluster: What’s the Real Difference and When to Use Each?

This article explains the core differences between distributed systems and clusters, detailing their architectures, efficiency goals, typical use cases such as Hadoop MapReduce and load‑balancing clusters, and outlines key concepts like scalability, high availability, load balancing, and error recovery.

Cluster Computing · Distributed Systems · HPC
10 min read
ITFLY8 Architecture Home
Jun 9, 2016 · Fundamentals

Distributed vs Cluster: Key Differences and When to Use Each

This article explains the core distinctions between distributed systems and clusters, covering their architectures, efficiency goals, typical use cases, and examples such as Hadoop MapReduce and load‑balancing clusters, while also detailing cluster types, high availability, load balancing, and high‑performance computing.

Cluster Computing · Distributed Systems · High‑performance computing
10 min read
Hulu Beijing
Aug 14, 2015 · Big Data

How Voidbox Bridges Docker and YARN for Scalable Big Data Workloads

Voidbox integrates Docker containers with YARN to simplify distributed application development, improve deployment, boost cluster efficiency, and provide fault‑tolerant, DAG‑based execution modes, enabling seamless resource management for Hadoop‑based big data jobs.

Big Data · Cluster Computing · DAG
17 min read