Author

Past Memory Big Data

A popular big-data architecture channel followed by over 100,000 developers. Publishes articles on Spark, Hadoop, Flink, Kafka, and more. Visit the Past Memory Big Data blog at https://www.iteblog.com, or search "Past Memory" on Google or Baidu.


Latest from Past Memory Big Data

59 recent articles


Jun 22, 2026 · Big Data

What’s New in Apache Spark 4.2? Core Features and Architecture Evolution

Apache Spark 4.2 introduces a lightweight Spark Connect architecture, native AI integration, enhanced Metrics View for unified semantics, Arrow‑first performance gains, advanced SQL extensions like vector search and QUALIFY, robust geospatial support, and a revamped streaming engine with auto CDC and sub‑millisecond state cleanup.

Apache Spark · Arrow · CDC

0 likes · 13 min read



Jun 2, 2026 · Artificial Intelligence

Beyond 100% Accuracy: Key Metrics to Evaluate in Text2SQL Systems

The article argues that a 100% accuracy claim for Text2SQL is misleading without considering stability, coverage, and pass‑rate metrics, and it details a deterministic NLQ pipeline that converts natural language to a verifiable intermediate format before rule‑based SQL compilation.

AI · Accuracy · Database

0 likes · 16 min read



Apr 16, 2026 · Artificial Intelligence

From Shrimp to Horses: The AI Agent Landscape’s Species Migration

The article examines the rapid shift in the AI Agent ecosystem from the popular OpenClaw “shrimp” tool to the emerging Hermes Agent “horse”, detailing Hermes’s four‑layer memory architecture, native WeChat integration, cloud provider support, and the broader industry move toward agents that continuously learn and retain knowledge.

AI Agent · Cloud Deployment · Hermes Agent

0 likes · 10 min read



Apr 15, 2026 · Artificial Intelligence

Meta’s Tokenmaxxing Craze: One Engineer’s 281 B Monthly Token Burn

An internal Meta dashboard called Claudeonomics revealed that over 85,000 employees consumed more than 60 trillion AI tokens in a month, with the top user burning 281 billion tokens—costing over $1.4 million—highlighting a new “Tokenmaxxing” arms race and exposing the shortcomings of using token volume as a productivity metric.

AI productivity · AI token consumption · Engineering metrics

0 likes · 8 min read



Apr 13, 2026 · Big Data

11 Critical Pitfalls to Watch When Upgrading from Spark 3 to Spark 4

Spark 4.0 delivers 20‑50% performance gains and new features like Spark Connect, VARIANT types, and enhanced SQL, but it also introduces breaking changes such as mandatory JDK 17, dropping Scala 2.12, default ANSI mode, removal of Mesos, and altered JDBC type mappings, requiring careful planning and staged migration to avoid runtime failures.

ANSI mode · Apache Spark · JDK 17

0 likes · 19 min read



Apr 13, 2026 · Big Data

Why Iceberg v3 Marks the “iPhone Moment” for Data Lakehouses

Apache Iceberg v3 introduces deletion vectors, row‑level lineage, a native VARIANT type, default column values, and nanosecond timestamps, delivering up to ten‑fold faster updates, native CDC, seamless semi‑structured data handling, and industry‑wide adoption that effectively ends the format war between lake and warehouse solutions.

Apache Iceberg · Data Lakehouse · Default Column Values

0 likes · 14 min read



Apr 11, 2026 · Artificial Intelligence

Hermes vs OpenClaw: What Am I Missing? The AI Agent Community’s Divisive Debate

A Reddit post sparked a heated debate over Hermes Agent and OpenClaw, leading to a deep technical comparison of their architectures, memory models, tool registration, security philosophies, deployment complexity, and ideal use‑cases, ultimately showing that each framework serves distinct AI Agent engineering paths.

AI Agent · Deployment · Hermes Agent

0 likes · 21 min read



Mar 27, 2026 · Big Data

Why AI Workloads Require Rebuilding Parquet: A Deep Dive into Lance

The article explains how traditional Parquet‑based lakehouse architectures, optimized for large‑scale scans, struggle with AI workloads that need ultra‑low‑latency random access, and how Lance redesigns the storage format, indexing and write path to provide O(1) addressing, native vector support, and seamless integration with native execution engines.

AI workloads · Data Lake · Lance

0 likes · 12 min read



Mar 10, 2026 · Artificial Intelligence

Full-Stack Evolution of a Game Data Analysis Agent

This article chronicles the step‑by‑step development of a game‑data analysis agent, detailing three architectural versions, the challenges of domain terminology, LLM uncertainty, permission granularity, and the engineering solutions—including LangGraph, Dify, custom prompts, state management, security checks, token optimization, and deployment within an internal network.

Agent Architecture · Deployment · Game Data Analysis

0 likes · 35 min read



Mar 9, 2026 · Industry Insights

Why Growing AI Agents Make Data Platforms Indispensable for Enterprises

The article explains that as AI agents move from demos to production, enterprises discover that the real bottleneck is not model capability but the underlying data platform, which must provide reliable data ingestion, semantic organization, access control, evaluation, and real‑time capabilities for agents to operate safely and effectively.

AI Agents · Data Platform · Data governance

0 likes · 11 min read
