Author

Past Memory Big Data

A popular big-data architecture channel with over 100,000 developer followers. It publishes articles on Spark, Hadoop, Flink, Kafka, and more. Visit the Past Memory Big Data blog at https://www.iteblog.com, or search "Past Memory" on Google or Baidu.

Latest from Past Memory Big Data

59 recent articles

Feb 25, 2026 · Artificial Intelligence

How Google’s TPU Systolic Array Powered AlphaGo and Large Language Models

Google’s Tensor Processing Unit (TPU) uses a systolic array architecture and low‑precision quantization to overcome the von Neumann bottleneck, delivering orders‑of‑magnitude higher throughput and energy efficiency for matrix‑multiplication‑heavy AI workloads, from AlphaGo’s inference to today’s large language models.

AI hardware · Google · Systolic Array

0 likes · 15 min read

Jan 4, 2026 · Industry Insights

Upgrade Your Stack: 2025 Apache Top-Level Projects You Should Know

The article reviews the eleven Apache projects that graduated to top-level status in 2025, spanning big‑data shuffle services, unified data processing, DevOps analytics, web frameworks, and messaging platforms. It explains which infrastructure challenge each one addresses and why it merits a place in a modern technology stack.

Data Infrastructure · DevOps · Web Framework

0 likes · 11 min read

Dec 31, 2025 · Industry Insights

NVIDIA Data‑Center GPU Evolution: V100 to B300 – A Programmer’s Selection Guide

The article traces the evolution of NVIDIA’s data‑center GPUs, from the Volta‑based V100 through the Ampere A100, the Hopper H100, and the specialized A800/H800/H20 to the Blackwell B200/B300. It details architectures, memory, interconnect, and performance trade‑offs, and offers a decision framework for programmers to match each model to specific AI workloads, budgets, and regulatory constraints.

AI · GPU · GPU Selection

0 likes · 11 min read

Dec 29, 2025 · Industry Insights

How Chinese Open‑Source Projects Dominated Half of 2025 Apache Top‑Level Projects

In 2025, five projects with Chinese origins (Uniffle, StreamPark, Gravitino, DevLake, and HertzBeat) became Apache Top‑Level Projects. They illustrate a shift toward centralized, platform‑oriented solutions driven by growing system scale, engineering complexity, and collaboration costs rather than a deliberate national agenda.

Big Data · Cloud Native · Top-Level Projects

0 likes · 7 min read

Dec 12, 2025 · Big Data

How Uber Reduced Data Freshness from Hours to Minutes Using Flink Streaming

Uber rebuilt its data‑lake ingestion pipeline on Apache Flink, replacing batch jobs with a streaming architecture that improves data freshness from hours to minutes, lowers compute usage by 25%, and addresses small‑file proliferation, partition skew, and checkpoint‑commit synchronization at petabyte scale.

Apache Flink · Apache Hudi · Data Freshness

0 likes · 10 min read

Dec 9, 2025 · Artificial Intelligence

A Decade of Evolution: Inside Pinterest’s AI Platform Journey

Over ten years, Pinterest transformed a fragmented machine‑learning stack into a unified AI platform, iterating from early ad‑hoc pipelines to scalable GPU‑accelerated services and learning that timing, organizational alignment, and efficiency are crucial for lasting impact.

AI platform · GPU inference · ML Ops

0 likes · 25 min read

Dec 4, 2025 · Artificial Intelligence

Text2SQL Showdown: Which Technical Path Delivers Higher Accuracy and Lower Cost?

The article compares two contrasting Text2SQL architectures, LLM + RAG + DSL and rule‑driven NLQ. It examines their accuracy under controlled conditions, implementation costs, support for complex queries, and suitability for real‑world enterprise BI, and concludes which approach is more reliable and cost‑effective.

AI+Rules · Business Intelligence · LLM

0 likes · 16 min read

Dec 1, 2025 · Big Data

Apache XTable: A Universal Translator for Data Lake Format Interoperability

Apache XTable introduces a lightweight metadata translation layer that decouples data storage from format metadata, enabling zero‑copy, omni‑directional conversion among Hudi, Iceberg, and Delta Lake. Organizations can write in one format and read with any engine without duplicating Parquet files.

Apache XTable · Data Lake · Delta Lake

0 likes · 7 min read

Nov 12, 2025 · Big Data

How Uber Upgraded Over 2 Million Spark Jobs from 2.4 to 3.3

Uber migrated more than two million daily Spark applications from version 2.4 to 3.3. The article details the motivations, architecture, four-step migration process, custom tools such as Polyglot Piranha and Iron Dome, and the resulting performance, cost, and productivity gains.

Apache Spark · Iron Dome · Kubernetes

0 likes · 11 min read

Jul 30, 2025 · Big Data

Why Iceberg Is Dropping Positional Deletes in Merge‑on‑Read Tables

The article explains how Apache Iceberg v3 replaces the scalability‑limited positional‑delete mechanism in Merge‑on‑Read tables with compact Deletion Vectors. It details the performance, I/O, and metadata drawbacks of positional deletes and shows how the new bitmap‑based approach resolves them.

Apache Iceberg · Data Lake · Deletion Vector

0 likes · 20 min read
