Building a Simple Yet Scalable Big Data Platform for Live Streaming with Consul

This article shares how a fast‑growing short‑video company designed a lean big‑data architecture, introduced the ALPS foundation service, and leveraged Consul to automate CMDB, job distribution, service discovery, and monitoring, enabling efficient growth with minimal operational overhead.

Efficient Ops
Instructor Introduction

Yu Bangxu joined the company in 2017 as the Big Data Director.

Preface

The company focuses on short‑video distribution and needs big data to act as a growth engine, not just as a platform. The goal is a data foundation that is simple and "good enough" to support rapid development.

Yizhibo & Xiaokaxiu Big Data Architecture

The architecture is pared down to a small set of core components (Elasticsearch, HDFS, Kafka, HBase, Spark, and a few others) to limit operational complexity. It follows a three‑layer model analogous to IaaS, PaaS, and SaaS.

The IaaS layer, called ALPS, includes Flume and Kafka for data transport, YARN and Kubernetes for compute scheduling, and HDFS and HBase for storage. The team aims to run Spark, Flink, and Storm on YARN while exploring Kubernetes for unified resource orchestration.

Operations are kept minimal: two people maintain the ALPS foundation, three handle data integration services, and about twenty work on data‑driven applications such as BI, recommendation, and risk control.

ALPS Introduction

ALPS (named after the Alps) is a lightweight big‑data foundation service. The team selected open‑source tools (Puppet, Ansible, Falcon, Elastic, Prometheus) but found none that simultaneously satisfied CMDB, job distribution, and service discovery.

Consul was adopted to fill this gap: its member list serves as the CMDB, Consul events handle job distribution, and Consul health checks cover service discovery and monitoring. The deployment consists of a cluster of Consul servers plus a Consul client agent on each data node.
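To make the member-list-as-CMDB idea concrete, here is a minimal Python sketch. It assumes a local agent at `http://127.0.0.1:8500` and uses Consul's `/v1/agent/members` HTTP endpoint; the inventory layout and the function names are illustrative and are not taken from the ALPS implementation.

```python
import json
import urllib.request

# Serf member status codes as reported in the "Status" field.
STATUS_NAMES = {0: "none", 1: "alive", 2: "leaving", 3: "left", 4: "failed"}


def fetch_members(base_url="http://127.0.0.1:8500"):
    """Read the gossip member list from a Consul agent (address is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/v1/agent/members", timeout=5) as resp:
        return json.load(resp)


def build_inventory(members):
    """Turn the raw member list into a small host inventory keyed by node name."""
    inventory = {}
    for member in members:
        tags = member.get("Tags") or {}
        inventory[member["Name"]] = {
            "address": member["Addr"],
            "status": STATUS_NAMES.get(member["Status"], "unknown"),
            # Server agents carry the tag role=consul; client agents use role=node.
            "role": "server" if tags.get("role") == "consul" else "client",
            "datacenter": tags.get("dc", ""),
        }
    return inventory
```

Calling `build_inventory(fetch_members())` on any node yields a view of every host the agent currently knows about, which is the property that lets the member list stand in for a hand-maintained CMDB.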

Consul’s gossip‑based Event system enables fast, reliable broadcast of jobs and scripts across nodes, supporting batch operations such as log cleanup, configuration changes, and service migrations.
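As a rough illustration of event-based job distribution, the sketch below builds a request for Consul's `PUT /v1/event/fire/<name>` endpoint and decodes the event list that a Consul watch of type `event` passes to its handler as JSON on stdin. The event name `log-cleanup`, the payload format, and the agent address are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_fire_request(name, payload, base_url="http://127.0.0.1:8500", node_filter=None):
    """Build (but do not send) the HTTP request that fires a user event."""
    url = f"{base_url}/v1/event/fire/{name}"
    if node_filter:
        # Optional regular expression restricting which nodes receive the event.
        url += "?" + urllib.parse.urlencode({"node": node_filter})
    return urllib.request.Request(url, data=payload.encode("utf-8"), method="PUT")


def decode_events(raw_json):
    """Decode the JSON event list handed to a watch handler.

    Consul base64-encodes each event's Payload; a missing payload is null.
    """
    events = json.loads(raw_json) or []
    return [
        (event["Name"], base64.b64decode(event.get("Payload") or "").decode("utf-8"))
        for event in events
    ]
```

Sending is then a matter of `urllib.request.urlopen(build_fire_request("log-cleanup", "/data/logs 7"))`. Consul documents event delivery as best-effort with no ordering guarantee, so handlers written this way are safest when they are idempotent.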

Consul DNS replaces manual /etc/hosts management for Hadoop nodes, offering automatic name resolution and failover without extra configuration.
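The naming scheme behind this can be sketched as follows. Consul's DNS interface answers for names of the form `<node>.node.<domain>` and `[<tag>.]<service>.service.<domain>`, with `consul` as the default domain; the node and service names used here are hypothetical, and `resolve` only works on hosts whose resolver forwards that domain to a Consul agent.

```python
import socket


def node_dns_name(node, domain="consul"):
    """DNS name under which Consul publishes a node's address."""
    return f"{node}.node.{domain}"


def service_dns_name(service, tag=None, domain="consul"):
    """DNS name for a registered service, optionally narrowed to one tag."""
    prefix = f"{tag}." if tag else ""
    return f"{prefix}{service}.service.{domain}"


def resolve(name):
    """Resolve a name through the system resolver and return the unique addresses.

    Assumes the host forwards the Consul domain to a Consul agent.
    """
    return sorted({info[4][0] for info in socket.getaddrinfo(name, None)})
```

A Hadoop configuration could then refer to `node_dns_name("dn-01")`, that is `dn-01.node.consul`, instead of an entry maintained by hand in /etc/hosts.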

Automation and Monitoring with Consul

Consul supports several kinds of health checks (Shell, TCP, HTTP, TTL, Docker, gRPC). Custom check scripts collect metrics and store them in MongoDB, forming a simple monitoring system. The team also borrowed ideas from the open‑source Bosun project for metric collection.
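A script-type check could look like the following sketch. Consul interprets a check script's exit code as 0 for passing, 1 for warning, and any other value for critical, and records the script's output alongside the status. The thresholds and function names are assumptions, and the MongoDB write described above is left out to keep the example dependency-free.

```python
import shutil

# Exit codes understood by Consul script checks.
PASSING, WARNING, CRITICAL = 0, 1, 2


def classify_usage(used_fraction, warn_at=0.80, crit_at=0.90):
    """Map a disk usage fraction to a Consul check exit code (thresholds are assumed)."""
    if used_fraction >= crit_at:
        return CRITICAL
    if used_fraction >= warn_at:
        return WARNING
    return PASSING


def check_disk(path="/"):
    """Measure usage of the filesystem containing path and return the exit code."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    # Consul stores whatever the script prints as the check's output.
    print(f"{path} {used_fraction:.1%} used")
    return classify_usage(used_fraction)


# To run this as a check script, end the file with:
#   import sys; sys.exit(check_disk(sys.argv[1] if len(sys.argv) > 1 else "/"))
```

Registered with an interval in the agent's check definition, a script like this gives each data node a self-reported disk status without a separate monitoring agent.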

Examples include automatic node decommissioning on disk failure via Consul Event and DNS‑based service discovery with automatic failover.
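The source does not describe how the decommissioning flow is wired, so the following is only one possible shape. It inspects check results in the format returned by Consul's `/v1/health/node/<node>` endpoint and, if a disk check is critical, prepares an event to broadcast. The `disk-` check ID prefix, the `decommission-node` event name, and the payload layout are all hypothetical.

```python
import json


def failed_disk_checks(checks, prefix="disk-"):
    """Return the IDs of critical checks whose CheckID starts with the assumed prefix."""
    return [
        check["CheckID"]
        for check in checks
        if check.get("Status") == "critical" and check.get("CheckID", "").startswith(prefix)
    ]


def decommission_event(node, checks):
    """Describe the event to fire for a node with failed disks, or None if it is healthy."""
    failed = failed_disk_checks(checks)
    if not failed:
        return None
    payload = json.dumps({"node": node, "failed_checks": failed}, sort_keys=True)
    return {"name": "decommission-node", "payload": payload}
```

A handler subscribed to that event on the cluster's management node could then run the actual Hadoop decommissioning steps, which are outside the scope of this sketch.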

Future Outlook

The team plans to share experiences from their CloudAtlas big‑data development suite, focusing on self‑service ETL (TT) and HBase‑based data warehousing to make data handling as easy as using Excel.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
