Tagged articles
14 articles
Architects' Tech Alliance
Jan 13, 2026 · Artificial Intelligence

Inside Google’s Massive TPU SuperPod: How Scale‑Up and Scale‑Out Build a 9,216‑Chip AI Engine

The article explains Google’s TPU data‑center architecture, detailing the vertical Scale‑Up strategy within a SuperPod, the horizontal Scale‑Out across SuperPods, the 3D Torus topology with Twisted variants, and the multi‑layer network design that enables petabyte‑scale AI training and inference.

AI hardware · Data center · Scale‑Up
8 min read
Architects' Tech Alliance
Jun 29, 2025 · Artificial Intelligence

Scale-Up vs Scale-Out: Balancing Performance and Flexibility in AI Infrastructure

This article explains the technical definitions, core differences, and practical use cases of Scale‑Up and Scale‑Out networking in AI systems, highlighting how they impact latency, bandwidth, and cost, and illustrates their combined application through NVIDIA's NVL72 supernode case study.

AI Infrastructure · GPU networking · High‑performance computing
14 min read
Architects' Tech Alliance
May 31, 2025 · Artificial Intelligence

GPU Cluster Scaling: Understanding Scale‑Up and Scale‑Out for AI Pods

This article explains the concepts of AI Pods and GPU clusters, compares vertical (scale‑up) and horizontal (scale‑out) expansion, describes XPU types, discusses internal and inter‑pod communication, and evaluates the benefits and drawbacks of each scaling approach along with relevant networking technologies.

AI Pods · GPU · InfiniBand
10 min read
NetEase Cloud Music Tech Team
Jan 2, 2024 · Backend Development

Cache Design and Optimization Practices for High‑Concurrency Music Library Service

The article details NetEase Cloud Music’s high‑concurrency cache architecture: lazy loading, hole‑wrapped objects for cache‑penetration protection, placeholder values for missing data, horizontal and vertical scaling with consistent hashing, and asynchronous binlog‑driven invalidation. Together these achieve sub‑millisecond reads for a read‑heavy, write‑light music library.

Distributed Systems · Scale‑Up · cache invalidation
12 min read
High Availability Architecture
Aug 31, 2021 · Cloud Native

High‑Availability Architecture for etcd in Ant Group’s Massive Kubernetes Clusters

The article describes how Ant Group operates one of the world's largest Kubernetes deployments, with more than 10,000 nodes, details the performance challenges the etcd key‑value store faces at that scale, and outlines the hardware upgrades, configuration tuning, monitoring, data splitting, and future distributed‑etcd strategies used to achieve robust high availability.

etcd · performance tuning · scale-out
21 min read
Cloud Native Technology Community
Nov 12, 2019 · Cloud Native

Welcome to the Copy‑Paste Era: Rethinking Cloud‑Native Elasticity

In this keynote from the 2019 Cloud Native Practice Summit, ThoughtWorks China CTO Xu Hao argues that the cloud’s fundamental power is copying images to create identical machines, urging a shift from scale‑up to scale‑out, defining elastic boundaries by business context, and adopting replication‑first thinking for cost‑effective cloud‑native architecture.

Microservices · elasticity · industry insights
7 min read
Efficient Ops
Feb 2, 2019 · Operations

Understanding DAS, NAS, and SAN: A Guide to Modern Storage Technologies

This article explains the three main storage architectures—Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN)—and compares their protocols, scaling methods, caching policies, RAID levels, LUN concepts, and core Linux block I/O structures.

DAS · NAS · RAID
16 min read
Architects' Tech Alliance
Jan 30, 2019 · Databases

SAP HANA Overview: Deployment Options, Use Cases, Scale‑Up/Scale‑Out, TDI, HA and Architecture

This article provides a comprehensive overview of SAP HANA, covering its role as an in‑memory database, deployment models (cloud, appliance, on‑premise), primary application scenarios, hardware certification, scale‑up versus scale‑out architectures, TDI integration, virtualization support, storage sizing, high‑availability options and node roles.

SAP HANA · Scale‑Up · TDI
12 min read
Architects' Tech Alliance
Jul 5, 2018 · Databases

Understanding SAP HANA Deployment Options, Scenarios, and High‑Availability Strategies

This article explains SAP HANA’s role as an in‑memory database platform, outlines its cloud and on‑premise deployment models, describes key business scenarios such as Business Warehouse on HANA and Business Suite on HANA, and details scale‑up vs. scale‑out, TDI, virtualization, storage sizing, and high‑availability configurations.

Deployment · SAP HANA · Scale‑Up
10 min read
Architects' Tech Alliance
Sep 9, 2016 · Cloud Native

Portworx Container-Defined Storage: Architecture, Principles, and Use Cases

This article explains how Portworx implements container-defined storage with a distributed metadata‑driven block layer, detailing its architecture, control‑plane and data‑plane operations, lifecycle management, integration with orchestration tools, and real‑world scenarios such as big‑data, CMS, and database workloads.

Kubernetes · Portworx · container storage
9 min read
Architects' Tech Alliance
Feb 12, 2016 · Industry Insights

Unlocking Massive Data Deduplication: PBBA Appliances vs Backup Software

Backup environments generate abundant duplicate data, making deduplication essential. This article examines how purpose‑built backup appliances (PBBA) and leading backup software implement variable‑length, global deduplication, compares scale‑out and scale‑up architectures, and discusses performance trade‑offs and CPU bottlenecks.

Backup · PBBA · deduplication
7 min read