How Hisense Juhau Revamped Its Big Data Platform for Real‑Time Intelligence
Hisense Juhau, an AI‑enabled TV cloud service, rebuilt its largely offline data platform around a real‑time data lake, compute‑storage separation, and serverless Spark and StarRocks on Alibaba Cloud. The result is data freshness of under five minutes, elastic scaling, and queries up to 5× faster for personalized content recommendation and smart operations.
Company Overview
Hisense Juhau (聚好看) is an internet TV cloud service provider under Hisense Group, serving over 120 million households with AI‑enabled smart TV experiences.
Challenges
A rapid shift to fine‑grained, personalized services required near‑real‑time data insight, dynamic user profiles, and minute‑level operational feedback loops. This exposed the limits of the traditional batch‑centric Lambda architecture: long data pipelines, tightly coupled compute and storage, costly expansion, and high latency when ingesting data into the lake.
Architecture Upgrade
In partnership with Alibaba Cloud, Juhau rebuilt its platform on Alibaba Cloud's open‑source big data services (EMR on ECS, Serverless Spark, Serverless StarRocks) and adopted Apache Paimon as the unified lake format. The upgrade covered four areas:
Real‑time data lake
Compute‑storage separation
Serverless compute model
Continuous performance optimization
Real‑time Data Lake Solution
By introducing Paimon with Serverless Spark, Juhau enabled stream‑batch unified ingestion, reducing data‑to‑lake latency from hours to under five minutes and supporting billion‑scale device data with minute‑level freshness.
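One way to make the "under five minutes" figure concrete is to measure freshness as the gap between when an event occurred and when it became visible in the lake. The following is a minimal sketch under that assumption; the function names and the sample timestamps are hypothetical and do not come from Juhau's platform.

```python
from datetime import datetime, timedelta

# Freshness target taken from the article: data visible in under five minutes.
FRESHNESS_TARGET = timedelta(minutes=5)


def freshness_lags(records):
    """Return commit_time - event_time for each (event_time, commit_time) pair."""
    return [commit - event for event, commit in records]


def meets_target(records, target=FRESHNESS_TARGET):
    """True when even the slowest record became visible within the target."""
    return max(freshness_lags(records)) <= target


# Hypothetical sample: three device events and the time each was committed.
base = datetime(2024, 1, 1, 12, 0, 0)
sample = [
    (base, base + timedelta(seconds=90)),
    (base + timedelta(seconds=30), base + timedelta(seconds=200)),
    (base + timedelta(seconds=60), base + timedelta(seconds=280)),
]

worst = max(freshness_lags(sample))
print(f"worst lag: {worst.total_seconds():.0f}s, within target: {meets_target(sample)}")
```

Tracking the worst lag rather than the average is a deliberate choice here: a freshness target is only useful if the slowest records also meet it.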
Compute‑Storage Separation
Data was migrated to Alibaba Cloud OSS, decoupling storage from compute. EMR Serverless Spark and StarRocks clusters connect to OSS over a high‑speed internal network, which allows elastic scaling, relieves pressure on the HDFS NameNode, and lets multiple engines share the same data.
Serverless Compute
Serverless Spark and StarRocks provide on‑demand compute that scales elastically within seconds. Jobs can scale to thousands of vCores within a minute, aligning resources with business‑driven SLAs and cutting TCO by over 30%.
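The link between SLAs, elastic scaling, and cost can be sketched with simple arithmetic. The example below assumes perfectly parallel work measured in vCore‑seconds and compares paying only for vCores used against a fixed cluster sized for peak demand. All numbers, function names, and the `on_demand_price_ratio` parameter are hypothetical illustrations, not figures from Juhau or Alibaba Cloud pricing.

```python
import math


def vcores_for_deadline(work_vcore_seconds, deadline_seconds, max_vcores):
    """Smallest vCore count that finishes the work by the deadline, capped at max_vcores.

    Assumes the work parallelizes perfectly, which real jobs only approximate.
    """
    if deadline_seconds <= 0:
        raise ValueError("deadline_seconds must be positive")
    needed = math.ceil(work_vcore_seconds / deadline_seconds)
    return min(max(needed, 1), max_vcores)


def relative_saving(hourly_vcores, on_demand_price_ratio=1.0):
    """Fraction saved by paying per vCore-hour used instead of for a peak-sized cluster.

    on_demand_price_ratio is the assumed unit-price premium of on-demand capacity.
    """
    fixed_cost = max(hourly_vcores) * len(hourly_vcores)
    on_demand_cost = sum(hourly_vcores) * on_demand_price_ratio
    return 1 - on_demand_cost / fixed_cost


# Hypothetical job: 600,000 vCore-seconds of work with a 5-minute SLA.
print(vcores_for_deadline(600_000, 300, max_vcores=4000))

# Hypothetical demand profile (vCores needed in each of six hours).
demand = [200, 200, 400, 2000, 1200, 400]
print(f"saving at equal unit price: {relative_saving(demand):.0%}")
print(f"saving with a 20% premium: {relative_saving(demand, 1.2):.0%}")
```

The saving depends entirely on how spiky the demand profile is; a flat workload would gain little from this model.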
Performance Optimizations
On EMR Serverless Spark, Juhau adopted Fusion Engine (vectorized Spark execution), the Celeborn Remote Shuffle Service, and Apache Paimon‑based small‑file compaction. Together these delivered up to 5× faster query execution, a 30% overall performance gain, and a reduction of more than 90% in small files.
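To illustrate why compaction cuts the file count so sharply, here is a minimal sketch of generic size‑based bin packing: small files are grouped in order until a target size is reached, and each group would be rewritten as one file. This is not Paimon's actual compaction algorithm; the function name, the 128 MB target, and the sample file sizes are assumptions for illustration.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily group small files, in order, into batches of at most target_mb.

    Files already at or above target_mb are kept alone and not rewritten.
    Returns a list of groups; each group would become one output file.
    """
    groups, current, current_size = [], [], 0
    for size in file_sizes_mb:
        if size >= target_mb:
            groups.append([size])  # large enough already; passes through alone
            continue
        if current and current_size + size > target_mb:
            groups.append(current)  # batch is full; start a new one
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)  # flush the last partial batch
    return groups


# Hypothetical partition: one hundred 4 MB files plus one 256 MB file.
sizes = [4] * 100 + [256]
groups = plan_compaction(sizes)
reduction = 1 - len(groups) / len(sizes)
print(f"{len(sizes)} files -> {len(groups)} files ({reduction:.0%} fewer)")
```

Fewer, larger files mean fewer object‑store requests and less metadata per query, which is the usual motivation for compaction on object storage.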
Outcome
The upgraded platform delivers sub‑5‑minute data freshness, elastic resource provisioning, and higher stability, and it lays the groundwork for future AI model training and cross‑scenario smart services.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.