
Open-Falcon: Scalable Open-Source Monitoring System for Modern Operations

Open‑Falcon, an open‑source, enterprise‑grade monitoring solution from Xiaomi’s operations team, offers zero‑configuration data collection, high‑throughput horizontal scaling, flexible alerting, efficient historical queries, a user‑friendly dashboard, and a highly available architecture. It ships with detailed documentation and quick installation steps.

Efficient Ops
Open-Falcon is an open‑source, enterprise‑grade monitoring system developed by Xiaomi’s operations team.

GitHub

https://github.com/xiaomi/open-falcon

Highlights and features

Zero‑configuration data collection: agent auto‑discovery, plugin support, and active push mode.

Horizontal scalability: the production environment handles 500,000 data points per second across collection, alerting, storage, and graphing, and expands horizontally without disruption.

Alert strategy auto‑discovery: web UI, policy templates with inheritance and overrides, multiple alert channels, and callback actions.

Human‑friendly alert settings: configurable maximum alert count, alert levels, recovery notifications, alert pausing, time‑based thresholds, maintenance windows, and alert merging.

Efficient historical data query: sub‑second response when querying hundreds of metrics over a year of history.

User‑friendly dashboard: multi‑dimensional data display with customizable dashboards.

Highly available architecture: no single point of failure, and easy to operate and deploy.
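The active push mode listed among the features can be illustrated with a short sketch. It assumes the agent accepts a JSON array of data points over HTTP at `http://127.0.0.1:1988/v1/push`, with the fields `endpoint`, `metric`, `timestamp`, `step`, `value`, `counterType`, and `tags`. Neither the URL nor the field names appear in this article, so check them against the documentation for your agent version. The helper names `build_point` and `push` are hypothetical.

```python
import json
import time
import urllib.request

# Assumed agent push URL; not stated in this article, verify for your deployment.
AGENT_PUSH_URL = "http://127.0.0.1:1988/v1/push"


def build_point(endpoint, metric, value, step=60, tags="", timestamp=None):
    """Build one data point dict in the assumed push format."""
    return {
        "endpoint": endpoint,      # host or service the value belongs to
        "metric": metric,          # metric name, e.g. "app.qps"
        "timestamp": int(timestamp if timestamp is not None else time.time()),
        "step": step,              # reporting interval in seconds
        "value": value,
        "counterType": "GAUGE",    # assumed type for instantaneous values
        "tags": tags,              # comma-separated key=value pairs
    }


def push(points, url=AGENT_PUSH_URL, timeout=5):
    """POST a list of data points to the agent and return the HTTP status code."""
    body = json.dumps(points).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

With a local agent running, an application would call something like `push([build_point("host01", "app.qps", 12.5, tags="module=api")])` once per reporting interval.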

Doc

Wiki

Open‑Falcon detailed introduction

Quick Install

Open‑Falcon consists of two main components: the graphing component and the alert component.

The graphing component handles data collection, storage, archiving, sampling, querying, and visualization (Dashboard/Screen). It can operate independently as a time‑series data storage and display solution.
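To make the archiving and sampling step more concrete, here is a minimal sketch of downsampling by averaging points into fixed‑width time buckets. The article does not describe how Open‑Falcon consolidates data, so this is a generic illustration rather than the project's implementation; the function name `downsample` and the choice of averaging are assumptions.

```python
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, average) tuples sorted by bucket_start.
    """
    buckets = {}
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        start = ts - (ts % bucket_seconds)
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]
```

Storing only the consolidated points for older data is one common way to keep year‑long queries fast, at the cost of losing per‑minute detail.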

The alert component handles alert policy configuration (portal), alert judgment (judge), alert handling (alarm/sender), and user‑group management (uic). It can also run independently.
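The alert judgment step, together with the maximum alert count and recovery notifications mentioned in the feature list, can be sketched as follows. This is a hypothetical illustration, not the judge module's actual logic: the class name `Judge`, the rule "the last N values all exceed a threshold", and the return values are all assumptions.

```python
from collections import deque


class Judge:
    """Hypothetical judge: fire when the last `window` values all exceed `threshold`."""

    def __init__(self, threshold, window=3, max_alerts=3):
        self.threshold = threshold
        self.window = window
        self.max_alerts = max_alerts
        self.recent = deque(maxlen=window)  # most recent values only
        self.sent = 0                       # alerts sent in the current episode
        self.in_problem = False

    def feed(self, value):
        """Consume one value; return "PROBLEM", "OK" (recovery), or None."""
        self.recent.append(value)
        if len(self.recent) < self.window:
            return None  # not enough data to judge yet
        breached = all(v > self.threshold for v in self.recent)
        if breached:
            self.in_problem = True
            if self.sent < self.max_alerts:
                self.sent += 1
                return "PROBLEM"
            return None  # max alert count reached, stay quiet
        if self.in_problem:
            # Condition cleared: emit one recovery notification and reset.
            self.in_problem = False
            self.sent = 0
            return "OK"
        return None
```

Capping repeated alerts and sending a single recovery message keeps on‑call noise down while still signaling when an incident starts and ends.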

Introduction

The monitoring system is a critical part of the operations workflow and the entire product lifecycle, providing early warnings to detect failures and detailed data for post‑mortem analysis. When a company is small and its operations team is just forming, choosing an open‑source monitoring system saves time and effort.

As the business grows rapidly, the number and complexity of monitored objects increase, making system capacity and user efficiency the most pressing concerns.

Many open‑source monitoring solutions exist. We initially used Zabbix, but as our scale grew and our requirements as an internet company became more specific, existing solutions could no longer meet our performance, scalability, and usability needs.

Therefore, over the past year we designed and built Open‑Falcon, drawing on feedback from SREs and developers and on best practices from large internet companies.

Screenshots

Dashboard Homepage

Dashboard Screen

Large Image

Portal template

Contributors

Xiaomi Operations Team: Blog http://noops.me

laiwei

秦晓辉 (Ulric Qin)

yubo

聂安 (niean)

License

Copyright 2014‑2015 Xiaomi, Inc. Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
