
From LAMP to Cloud‑Native: Evolving Application Data Architecture and Best Practices

This article traces two decades of application data architecture evolution, comparing traditional single‑system LAMP designs with modern multi‑component cloud‑native stacks, and offers practical guidance on scaling, component selection, CDC‑based data derivation, and cloud‑native implementations such as Tablestore.

Alibaba Cloud Developer

Introduction

Over the past twenty years, both the form applications take and the architectures behind them have changed substantially, driven by new usage scenarios, larger scale, and better infrastructure such as the Internet, 4G/5G, distributed systems, and cloud computing.

This article divides data systems into two categories: application-centric systems, shaped by business requirements, and data-centric systems, shaped by a specific type of data.

Application System Data Architecture

Application data architecture has evolved from a single‑system model to a modern multi‑component architecture, where each component is chosen for its strengths in handling different data types and loads.

1. Traditional Data Architecture (Single System)

LAMP Architecture

The classic LAMP stack (Linux, Apache, MySQL, PHP) provides a low‑cost, open‑source solution but faces scalability limits as traffic grows.

How to Scale

Scale‑up vs Scale‑out

Storage‑compute separation

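Scale-out works differently on each side. Storage scales out by sharding data across nodes. Compute scales out according to whether it is stateless or stateful: stateless nodes can simply be added, while stateful nodes need requests routed to the node that holds the state, or the state replicated across nodes.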
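As an illustration of storage-side sharding, here is a minimal sketch of hash-based key routing. The names `shard_for` and `ShardedStore` are hypothetical, and each shard is an in-memory dict standing in for a separate database instance; a real deployment would also have to handle resharding, which this sketch ignores.

```python
import hashlib
from typing import Any, Dict, List, Optional


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy sharded key-value store; each dict stands in for one storage node."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, Any]] = [{} for _ in range(num_shards)]

    def put(self, key: str, value: Any) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> Optional[Any]:
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Because the routing function depends only on the key and the shard count, any stateless compute node can locate a row without coordination. Changing the shard count remaps most keys, which is one reason scaling out a sharded MySQL deployment is costly.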

Scalable Traditional Architecture

Even with scaling techniques, LAMP‑based systems suffer from high storage‑side costs, limited flexibility, and expensive MySQL scaling.

Solving Storage‑Side Scaling

MySQL remains indispensable, but auxiliary components like Redis (caching) and other services offload query traffic and storage load.
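One common way a cache offloads query traffic is the cache-aside pattern, sketched below. This is an assumption about how such a cache might be used, not something the article specifies: `CacheAsideReader` is a hypothetical name, a plain dict stands in for Redis, and `load_from_db` stands in for a MySQL query.

```python
from typing import Callable, Dict, Optional


class CacheAsideReader:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self.cache: Dict[str, str] = {}      # stand-in for Redis
        self.load_from_db = load_from_db     # stand-in for a MySQL query
        self.db_reads = 0                    # counts queries that reached the database

    def get(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]           # cache hit: the database is not touched
        self.db_reads += 1
        value = self.load_from_db(key)
        if value is not None:
            self.cache[key] = value          # populate the cache for later reads
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, typically after the row changes in the database."""
        self.cache.pop(key, None)
```

Repeated reads of the same key reach the database only once. The trade-off is staleness: until `invalidate` is called, a cached value can lag behind the primary store.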

2. Modern Data Architecture (Diverse Systems)

Problem Definition and Divide‑and‑Conquer

Identify what MySQL is responsible for, then split the work that can leave it into two kinds of task: traffic offload and data offload.

Traffic offload: serve part of the read/write traffic from replicas or caches instead of the primary.

Data offload: move non‑transactional data to cheaper storage.

Example: an e‑commerce order system uses CDC to stream changes to downstream stores.
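The order-system example can be sketched as a stream of change events applied to two downstream stores. Everything here is illustrative: the `ChangeEvent` shape, the `status` field, and the two apply functions are assumptions, with a dict standing in for a cache and a status-to-keys mapping standing in for a search index.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class ChangeEvent:
    op: str                          # "insert", "update", or "delete"
    key: str                         # primary key of the changed order
    row: Optional[Dict[str, str]]    # full row after the change; None for deletes


def apply_to_cache(cache: Dict[str, Dict[str, str]], event: ChangeEvent) -> None:
    """Keep a key-value copy of each order in step with the source table."""
    if event.op == "delete":
        cache.pop(event.key, None)
    else:
        cache[event.key] = dict(event.row)


def apply_to_search_index(index: Dict[str, Set[str]], event: ChangeEvent) -> None:
    """Maintain a status -> order keys mapping, a tiny stand-in for a search index."""
    for keys in index.values():
        keys.discard(event.key)      # remove any previous posting for this order
    if event.op != "delete":
        index.setdefault(event.row["status"], set()).add(event.key)
```

The application writes only to the primary store. Each downstream store derives its own view from the same event stream, so adding another store does not require changing the write path.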

Choosing Storage Components

1) Define Requirements by Scenario

Consider SLA, reliability, scalability, and operability alongside functional needs.

2) Types and Differences of Storage Components

Data model & query language (relational, document, wide‑column, time‑series).

SQL vs NoSQL.

Database vs Data Warehouse.

Cloud‑hosted vs Cloud‑native.

Derived Data Architecture

Primary storage (MySQL) holds the source of truth, while auxiliary storage (Redis, Elasticsearch, HBase, ClickHouse) provides read‑heavy, search, or analytical capabilities.

Data synchronization methods include application‑level multi‑write, asynchronous queue replication, and CDC (Change Data Capture), with CDC being the most application‑friendly.
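One reason CDC asks little of the application is that the consumer, not the writer, tracks progress through the change log. The sketch below assumes a log of `(offset, key, value)` tuples delivered in offset order; `DerivedStore` and its `checkpoint` field are hypothetical names, and real CDC tools persist the checkpoint durably rather than in memory.

```python
from typing import Dict, List, Tuple

Change = Tuple[int, str, str]   # (log offset, key, new value), assumed ordered by offset


class DerivedStore:
    """Auxiliary store rebuilt from a change log, with a resume checkpoint."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.checkpoint = -1     # highest log offset applied so far

    def consume(self, log: List[Change]) -> int:
        """Apply unseen changes and return how many were applied."""
        applied = 0
        for offset, key, value in log:
            if offset <= self.checkpoint:
                continue         # already applied, so redelivery is harmless
            self.data[key] = value
            self.checkpoint = offset
            applied += 1
        return applied
```

If the consumer crashes and the log is redelivered from an earlier position, the store still converges to the same state. Application-level multi-write has no equivalent log to replay: a failure between the two writes leaves the stores inconsistent.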

Modern Application System Data Architecture

MySQL – primary transactional store.

Redis – query result cache.

Elasticsearch – full‑text and complex query offload.

HBase – wide‑column store for non‑transactional data.

ClickHouse – MPP analytical warehouse.

Data flows are managed via MySQL CDC (e.g., Canal), optionally supplemented by a T+1 (next-day) full sync into ClickHouse.

3. Cloud Data Architecture Practice

Moving to the cloud replaces open‑source components with managed services to reduce operational cost and improve stability.

DTS – managed MySQL CDC service.

Tair – enterprise‑grade Redis.

Tablestore – Serverless wide‑column store (compatible with HBase) with advanced indexing and built‑in CDC.

ADB – real‑time analytical database compatible with MySQL protocol.

1. Tablestore

Tablestore is Alibaba Cloud’s Serverless multi‑model storage supporting wide‑column, timeline, and time‑series models, offering high throughput, PB‑scale capacity, and built‑in SQL support.

Multiple models: WideColumn, Timeline, Timestream.

Rich indexing: secondary and multi-dimensional indexes, presented as a stronger alternative to Elasticsearch for these queries.

Storage‑compute separation with independent billing.

Serverless, zero‑ops, global deployment.

Integrated CDC (Tunnel Service) for real‑time data subscription and Flink integration.

Conclusion

Choosing a technology stack involves trade‑offs; classic design principles such as divide‑and‑conquer, derived data architecture, and flexible component composition remain valuable as applications grow more complex.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
