Why Serverless Big Data Is the Future of Scalable Analytics

The article traces the evolution from on‑premise relational databases to self‑built Hadoop clusters, cloud‑hosted Hadoop, and finally to semi‑managed and serverless big‑data services, highlighting their advantages, challenges, and the four key pillars—security, elasticity, intelligence, and usability—that will shape the future of serverless big‑data analytics.

Huawei Cloud Developer Alliance

Jun 5, 2020

Evolution of Big Data Architecture

For many enterprises, daily T+1 reporting (reviewing the previous day's business, inventory, and user metrics on the following day) drives the need for large-scale data processing, which in turn requires a robust big-data architecture.

Relational Database Phase

Initially, analysts ran queries on standby replicas of business databases during off‑peak hours (e.g., nights), but growing data volumes made this approach cumbersome.

On‑Premise Hadoop Clusters

In 2004 Google published the MapReduce paper, and Hadoop was released in 2006. Companies began building their own Hadoop clusters and adopting ecosystem components such as Kafka, Hive, Spark, and Flink.

Cloud‑Hosted Hadoop

With the rise of public clouds like AWS and Alibaba Cloud, teams could deploy Hadoop on virtual machines, gaining elastic scaling that matched fluctuating workloads.

Semi‑Managed Cloud Big Data Services

Vendors introduced semi-managed services (e.g., AWS EMR, Huawei MRS) that simplify installation, upgrades, and operations while preserving open-source compatibility, which reduces migration effort.

Serverless Big Data Services

AWS launched Athena in 2016, and Huawei introduced DLI in 2017. Both let users run standard SQL directly on data lakes without managing servers, and both charge only for the queries that are executed.

A 2019 Berkeley paper identified three key characteristics of serverless services:

Decoupling of storage and compute.

Automatic resource provisioning for code execution.

Pay‑as‑you‑go billing based on usage.

Both semi‑managed and serverless services will coexist: large enterprises often choose semi‑managed for flexibility and skill development, while cost‑sensitive SMEs favor serverless for its on‑demand, low‑maintenance model.
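
The cost side of this trade-off can be made concrete with a rough break-even calculation. The sketch below compares a pay-per-query model with an always-on cluster. All prices, node counts, and scan sizes are hypothetical values chosen for illustration, not figures from any vendor, and the function names are made up for this example.

```python
def serverless_monthly_cost(queries_per_day, tb_per_query, price_per_tb, days=30):
    """Pay-per-query model: cost grows with the data actually scanned."""
    return queries_per_day * tb_per_query * price_per_tb * days


def cluster_monthly_cost(nodes, price_per_node_hour, days=30):
    """Always-on cluster: cost is fixed no matter how many queries run."""
    return nodes * price_per_node_hour * 24 * days


def break_even_queries_per_day(nodes, price_per_node_hour, tb_per_query, price_per_tb):
    """Daily query count at which both models cost the same."""
    daily_cluster_cost = nodes * price_per_node_hour * 24
    cost_per_query = tb_per_query * price_per_tb
    return daily_cluster_cost / cost_per_query


# Hypothetical inputs (assumptions for illustration only).
PRICE_PER_TB = 5.0          # currency units per TB scanned
PRICE_PER_NODE_HOUR = 0.5   # currency units per node per hour
NODES = 10                  # size of the always-on cluster
TB_PER_QUERY = 0.25         # average data scanned by one query

light_workload = serverless_monthly_cost(20, TB_PER_QUERY, PRICE_PER_TB)
heavy_workload = serverless_monthly_cost(500, TB_PER_QUERY, PRICE_PER_TB)
cluster = cluster_monthly_cost(NODES, PRICE_PER_NODE_HOUR)
break_even = break_even_queries_per_day(
    NODES, PRICE_PER_NODE_HOUR, TB_PER_QUERY, PRICE_PER_TB
)

print(f"serverless, 20 queries/day:  {light_workload:.2f}")
print(f"serverless, 500 queries/day: {heavy_workload:.2f}")
print(f"always-on cluster:           {cluster:.2f}")
print(f"break-even queries/day:      {break_even:.1f}")
```

Under these assumed numbers, a light workload is far cheaper on the pay-per-query model, while a sustained heavy workload favors the fixed-cost cluster. The break-even point moves with every input, so real pricing has to be substituted before drawing conclusions.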

Four Pillars for Serverless Big Data Success

Security: Isolation methods such as sandboxing, container isolation, or physical isolation are required to mitigate cross-tenant attacks.

Elasticity: Fast scaling and predictive scaling based on workload patterns and historical execution data are essential.

Intelligence: Automated tuning of service parameters and data organization reduces the manual optimization burden.

Usability: Seamless UI design, compatible APIs, and scriptable job operations that mirror open-source tools lower the learning curve.
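
As an illustration of the elasticity pillar, predictive scaling from historical execution data could look roughly like the sketch below. It is a deliberately simple approach (a per-hour average plus a safety margin), not the method used by any particular service. The function names, the `tasks_per_worker` capacity figure, and the sample history are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def forecast_by_hour(history):
    """Average the observed load for each hour of the day.

    history: iterable of (hour_of_day, concurrent_tasks) samples
    taken from past executions.
    """
    buckets = defaultdict(list)
    for hour, tasks in history:
        buckets[hour].append(tasks)
    return {hour: mean(samples) for hour, samples in buckets.items()}


def workers_needed(expected_tasks, tasks_per_worker, headroom=1.2, min_workers=1):
    """Turn a load forecast into a worker count, with headroom for bursts."""
    required = math.ceil(expected_tasks * headroom / tasks_per_worker)
    return max(min_workers, required)


def plan_capacity(history, tasks_per_worker, headroom=1.2):
    """Build an hour -> worker-count plan that can be provisioned in advance."""
    forecast = forecast_by_hour(history)
    return {
        hour: workers_needed(load, tasks_per_worker, headroom)
        for hour, load in sorted(forecast.items())
    }


# Hypothetical history: a nightly batch peak at 02:00 and a quiet afternoon.
history = [(2, 400), (2, 440), (14, 30), (14, 50)]
plan = plan_capacity(history, tasks_per_worker=50)
print(plan)
```

A production scaler would likely combine a forecast like this with fast reactive scaling, since historical averages cannot anticipate unplanned spikes.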

Conclusion

Serverless big data services represent a forward‑looking model that, as current challenges are addressed, will increasingly dominate data analytics, making powerful analysis as readily available as water and electricity.
