
How Meizu Built an Agile Big Data Platform for Millions of Users

The Meizu Tech Open Day showcased the company's rapid evolution into a data‑driven mobile internet firm, covering its DW1.0 and DW2.0 data‑warehouse architectures, Alibaba's recommendation pipelines, Baidu's Spark adoption, and Meizu's ELK‑based log analytics, along with practical lessons and future challenges.

ITPUB

Meizu Data Warehouse Architecture (DW1.0)

DW1.0 integrates website logs, ERP data, and real‑time messages. It is organized into four layers: data source, ingestion, warehouse, and application. The ingestion layer uses AnyLoader for batch loading of files and databases, and AnyStream for near‑real‑time ingestion of NoSQL streams. The platform supports a data development environment and multiple data product libraries.

Meizu big data overall architecture

DW2.0 Roadmap

Goals include improving data‑product user experience, addressing core business pain points, delivering role‑specific personalized data products, and consolidating processes and integrations.

Best‑Practice Recommendations

Avoid non‑standard business designs and inconsistent data definitions.

Ensure reliable data sources and complete SDK tracking.

Prevent repeated platform migrations, maintain a unified data portal, and use consistent visualization tools.

Alibaba Recommendation Technology

The offline pipeline consists of four modules:

I2I (item‑to‑item) similarity – computed with collaborative filtering, co‑occurrence, purchase‑probability, log‑odds‑ratio, and mutual‑information algorithms.

I2I pairing – compute category‑level relationships and map them to item‑level pairings.

C2I (category‑to‑item) – rank high‑quality items per leaf category using sales, CTR, and quality metrics.

U2I (user‑to‑item) – model long‑term user preferences across channels, account for purchase cycles, and filter out items similar to those already purchased.
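
As a rough illustration of the I2I and U2I modules, the sketch below computes a cosine‑normalized co‑occurrence similarity and then aggregates it over a user's history. It is a minimal example under stated assumptions, not the algorithm described in the talk: the function names, the toy `baskets` data, and the choice of cosine normalization are all hypothetical, and the "already purchased" filter is simplified to skipping items the user already owns.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarity(baskets):
    """Cosine-normalized co-occurrence: sim(a, b) = co(a, b) / sqrt(n(a) * n(b))."""
    item_count = defaultdict(int)  # users who interacted with each item
    pair_count = defaultdict(int)  # users who interacted with both items of a pair
    for items in baskets.values():
        unique = sorted(set(items))
        for item in unique:
            item_count[item] += 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1

    sim = defaultdict(dict)
    for (a, b), co in pair_count.items():
        score = co / sqrt(item_count[a] * item_count[b])
        sim[a][b] = score
        sim[b][a] = score
    return sim


def recommend(user, baskets, sim, top_n=3):
    """Simplified U2I step: sum I2I scores over the user's history, skipping owned items."""
    owned = set(baskets[user])
    scores = defaultdict(float)
    for item in owned:
        for other, s in sim.get(item, {}).items():
            if other not in owned:
                scores[other] += s
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical interaction data: user -> items purchased.
baskets = {
    "u1": ["phone", "case", "charger"],
    "u2": ["phone", "case"],
    "u3": ["phone", "earbuds"],
    "u4": ["case", "charger"],
}
sim = item_similarity(baskets)
print(recommend("u3", baskets, sim))  # ['case', 'charger']
```

A production system would compute these tables offline over far larger logs and would likely blend several of the similarity measures listed above rather than rely on one.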

The real‑time pipeline framework provides componentized stages, unified interfaces, centralized logging for debugging, and full‑link (end‑to‑end) data analysis. Its key steps are:

Source retrieval – collect search terms, clicks, purchase history, UGC, etc.

Source arbitration – rank sources for relevance, diversity, and context match.

Candidate set recall – multi‑channel recall (item‑item, user‑user, feature‑based, novelty, hot items).

Filtering – apply scenario‑specific filters.

Scoring and ranking – online fine‑grained ranking with real‑time models.

Post‑processing – handle diversity, novelty, business metrics, pagination, and explanation generation.
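
The later steps above (recall, filtering, ranking, post‑processing) can be sketched as a chain of small functions with a shared interface. This is an illustrative toy, not the framework presented in the talk: the `Item` fields, the channel names, the out‑of‑stock filter, and the per‑category diversity cap are assumptions, and the `score` field stands in for the output of a real‑time model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    category: str
    in_stock: bool
    score: float  # placeholder for the output of a real-time ranking model


def recall(channels):
    """Merge candidates from several recall channels, dropping duplicates."""
    seen, merged = set(), []
    for channel_items in channels.values():
        for item in channel_items:
            if item.item_id not in seen:
                seen.add(item.item_id)
                merged.append(item)
    return merged


def apply_filters(candidates):
    """Scenario-specific filter; here it simply drops out-of-stock items."""
    return [item for item in candidates if item.in_stock]


def rank(candidates):
    """Order candidates by score, highest first."""
    return sorted(candidates, key=lambda item: item.score, reverse=True)


def post_process(ranked, max_per_category=1, page_size=3):
    """Enforce diversity with a per-category cap, then cut the first page."""
    per_category, page = Counter(), []
    for item in ranked:
        if per_category[item.category] < max_per_category:
            per_category[item.category] += 1
            page.append(item)
        if len(page) == page_size:
            break
    return page


phone = Item("phone", "devices", True, 0.9)
tablet = Item("tablet", "devices", True, 0.8)
case = Item("case", "accessories", True, 0.7)
band = Item("band", "wearables", False, 0.6)
charger = Item("charger", "power", True, 0.5)

# Hypothetical recall channels feeding the candidate set.
channels = {
    "item_to_item": [phone, case],
    "hot_items": [tablet, phone, band],
    "novelty": [charger],
}

page = post_process(rank(apply_filters(recall(channels))))
print([item.item_id for item in page])  # ['phone', 'case', 'charger']
```

Because every stage takes a list of items and returns one, stages can be logged, swapped, or tested individually, which is the kind of componentization the framework is described as providing.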

‘You May Also Like’ framework
Alibaba expert An Weiting

Baidu Spark Deployment

Spark is used for both batch and streaming workloads. Key advantages:

In‑memory processing reduces disk I/O and enables caching of intermediate and final results.

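RDD lineage records how each dataset was derived, so Spark can recompute lost partitions after a failure instead of replicating them; many jobs therefore recover without explicit checkpoints.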

Rich transformation APIs simplify complex data pipelines.

Spark Streaming and Spark SQL allow seamless integration of real‑time and batch processing.
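
To make the caching and lineage points concrete, here is a toy model in plain Python. It is not Spark's API or implementation: `ToyRDD`, `evict`, and `compute_count` are hypothetical names, the data lives in a single process, and, unlike Spark, this toy caches every computed result automatically. It only shows the idea that a derived dataset remembers its parent and its transformation, so a lost result can be rebuilt by replaying them.

```python
class ToyRDD:
    """Toy stand-in for an RDD: remembers its parent and how it is derived."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source   # only set on the root dataset
        self._parent = parent   # lineage: where this dataset comes from
        self._fn = fn           # lineage: how it is derived from the parent
        self._cache = None      # in-memory copy of the computed result
        self.compute_count = 0  # how many times this dataset was (re)computed

    def map(self, f):
        return ToyRDD(parent=self, fn=lambda rows: [f(r) for r in rows])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda rows: [r for r in rows if pred(r)])

    def collect(self):
        if self._cache is None:
            if self._parent is None:
                rows = self._source
            else:
                rows = self._fn(self._parent.collect())
            self._cache = list(rows)
            self.compute_count += 1
        return self._cache

    def evict(self):
        """Simulate losing the in-memory result, for example when a worker fails."""
        self._cache = None


logs = ToyRDD(source=["INFO ok", "ERROR disk", "ERROR net"])
errors = logs.filter(lambda line: line.startswith("ERROR")).map(str.lower)

first = errors.collect()   # computed once and kept in memory
errors.collect()           # served from the in-memory copy, no recomputation
errors.evict()             # the cached result is lost
second = errors.collect()  # rebuilt by replaying the recorded lineage

print(second, errors.compute_count)  # ['error disk', 'error net'] 2
```

In real Spark, caching is opt‑in and lineage is tracked per partition across a cluster, so only the lost partitions are recomputed.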

Spark in Baidu

Meizu Log‑Analysis Platform

Daily log volume ranges from hundreds of GB to several TB. The ELK stack (Elasticsearch, Logstash, Kibana) was selected for its stability, horizontal scalability, ease of use, and real‑time capabilities.

Elasticsearch stores distributed log data and supports fast search.

Logstash normalizes heterogeneous log formats.

Kibana provides a unified query UI for troubleshooting.
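
The normalization role that Logstash plays can be illustrated with a small Python sketch. This is not Logstash configuration and not Meizu's actual setup: the two log formats, the regular expressions, and the output field names (other than the conventional `@timestamp`) are assumptions chosen for the example. The point is that differently shaped lines end up as documents with the same fields, which is what makes them searchable side by side in Elasticsearch.

```python
import json
import re
from datetime import datetime

# Assumed format 1: a web access log in the common "combined"-like layout.
ACCESS = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
# Assumed format 2: an application log "YYYY-MM-DD HH:MM:SS LEVEL message".
APP = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<msg>.*)"
)


def normalize(line):
    """Map a raw log line onto one shared document shape."""
    m = ACCESS.match(line)
    if m:
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        status = int(m["status"])
        return {
            "@timestamp": ts.isoformat(),
            "type": "access",
            "level": "ERROR" if status >= 500 else "INFO",
            "message": f'{m["method"]} {m["path"]} {status}',
        }
    m = APP.match(line)
    if m:
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        return {
            "@timestamp": ts.isoformat(),
            "type": "app",
            "level": m["level"],
            "message": m["msg"],
        }
    # Keep unparseable lines rather than dropping them silently.
    return {"type": "unparsed", "message": line}


raw_lines = [
    '10.0.0.1 - - [10/Oct/2016:13:55:36 +0800] "GET /index.html HTTP/1.1" 502 512',
    "2016-10-10 13:55:40 WARN queue depth high",
    "free-form text with no known structure",
]
docs = [normalize(line) for line in raw_lines]
for doc in docs:
    print(json.dumps(doc))
```

In an actual ELK deployment this mapping would typically live in Logstash filter configuration, with Elasticsearch indexing the resulting documents and Kibana querying them.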

Meizu operations architect Lin Zhonghong
