Tag

intelligent alerting


Data Thinking Notes
Jan 10, 2023 · Big Data

How Bilibili Built a Scalable Data Quality Platform for Billions of Events

This article describes Bilibili’s data quality platform: its background, objectives, theoretical models, and three workflow stages (recording, checking, alerting). It also covers the metrics DSL, root‑cause analysis, scheduling strategies, heterogeneous source integration, rule coverage, intelligent monitoring, and future plans for automated, real‑time, high‑reliability data assurance across massive daily workloads.

Scheduling · automation · big data
21 min read
Efficient Ops
Sep 28, 2022 · Operations

How Event‑Driven Alert Centers Revolutionize Intelligent Operations

This article presents a comprehensive overview of an event‑centric intelligent alert analysis platform, covering its evolution, core challenges, the concept of alert events, AI‑driven correlation techniques, and the MC‑Stack platform that powers modern operations.

AIOps · alert management · event-driven monitoring
13 min read
JD Retail Technology
Jun 22, 2018 · Operations

JDOS Operations Platform: Managing Million‑Scale Container Clusters at JD.com

This article describes JD.com's JDOS Operations Platform, which enables two operators to manage millions of Docker and Kubernetes containers across massive clusters, detailing its architecture, regression analysis of scale, gossip‑based inspection system, intelligent alert convergence, and performance improvements for ultra‑large‑scale environments.

Container Orchestration · Docker · Large-Scale Operations
11 min read
JD Tech
Jun 22, 2018 · Operations

JDOS Operations Platform: Managing Millions of Containers at JD.com

The article describes how JD.com built and operates the JDOS Operations Platform to manage a multi‑million‑container Docker and Kubernetes fleet. It details the challenges of massive scale and the architectural components (configuration center, operation center, inspection system, gossip‑based communication, and an intelligent alerting system) that together enable efficient, automated, and reliable large‑scale container operations.

Container Management · Large-Scale Operations · gossip protocol
12 min read