
How a Chinese Bank Built a Real‑Time Log Management Platform with Apollo and Elasticsearch

Facing massive log volumes from many systems, China Minsheng Bank’s big‑data team built a real‑time intelligent log platform that integrates Ctrip’s open‑source Apollo configuration center with Elasticsearch. The result is centralized, versioned, hot‑reloading configuration with role‑based parameter management and high‑availability deployment across more than 1,000 servers.

dbaplus Community

Background

China Minsheng Bank’s rapid IT expansion created a need for a unified, intelligent log analysis solution capable of handling over 1,000 servers and 10‑20 TB of daily logs across diverse OSes and applications (DB2, Oracle, MySQL, Redis, WebLogic, Kafka, etc.). Existing ES clusters suffered from fragmented configuration files, version chaos, and insufficient monitoring.

Why Apollo?

The team selected Ctrip’s open‑source Apollo as its configuration‑management center because it offers unified, environment‑aware configuration, real‑time push (hot‑publish), version control, gray‑release, permission governance, client monitoring, and native Java and .NET SDKs. Deployment is also simple: only Java and MySQL are required.

Apollo Core Features Used

Unified management of configurations across environments, clusters, and namespaces.

Real‑time configuration push (≈1 s latency) after publishing.

Versioned releases with rollback capability.

Gray‑release to a subset of instances before full rollout.

Fine‑grained permission and audit logging.

Client‑side configuration usage monitoring.

Native Java and .NET clients, plus an HTTP API.

Apollo Architecture in the Log Platform

The platform runs three Apollo services (Config Service, Admin Service, Portal) on three high‑availability nodes, all backed by a single MySQL cluster. Clients (ES nodes) obtain configurations via long‑polling HTTP connections, with a fallback periodic pull every five minutes.
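The long-polling-plus-fallback pattern described above can be sketched in a few lines. This is a simplified Python illustration, not Apollo's client code: `FakeConfigService`, `PollingClient`, and their methods are hypothetical names, an in-process queue stands in for the HTTP long poll, and the `cluster.name` values are made up.

```python
import queue
import time


class FakeConfigService:
    """In-memory stand-in for a config server (hypothetical, not Apollo's API)."""

    def __init__(self, config):
        self.config = dict(config)
        self._notifications = queue.Queue()

    def publish(self, key, value):
        """Change a value and notify any waiting long poll."""
        self.config[key] = value
        self._notifications.put(key)

    def wait_for_change(self, timeout):
        """Block until something is published or the timeout expires."""
        try:
            return self._notifications.get(timeout=timeout)
        except queue.Empty:
            return None

    def fetch(self):
        return dict(self.config)


class PollingClient:
    def __init__(self, service, pull_interval=300.0):
        self.service = service
        self.pull_interval = pull_interval  # five-minute fallback pull
        self.config = service.fetch()
        self.last_pull = time.monotonic()

    def step(self, long_poll_timeout):
        """One loop iteration; returns True if the config was re-fetched."""
        notified = self.service.wait_for_change(long_poll_timeout)
        overdue = time.monotonic() - self.last_pull >= self.pull_interval
        if notified is None and not overdue:
            return False
        self.config = self.service.fetch()
        self.last_pull = time.monotonic()
        return True


service = FakeConfigService({"cluster.name": "log-es"})
client = PollingClient(service)
service.publish("cluster.name", "log-es-v2")
refreshed = client.step(long_poll_timeout=0.01)  # notification is already queued
idle = client.step(long_poll_timeout=0.01)       # nothing new, not yet overdue
```

The fallback pull matters when a notification is lost: even if the long poll never fires, the client re-fetches once `pull_interval` has elapsed.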

ES Role‑Based Configuration Design

Elasticsearch 5.5 clusters were split into four roles (master, client, hot, warm) with hot nodes using SSDs for fast reads/writes. Configuration files were divided into a default namespace (common parameters) and role‑specific namespaces (hot, warm, master, client). Role namespaces have higher priority, allowing targeted overrides without editing every node’s file.
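One way to arrive at such a split is to treat every setting that is identical across all roles as common and everything else as a role override. The Python sketch below illustrates that idea only; `split_namespaces` is a hypothetical helper, and the setting values (including the `node.attr.box_type` tag) are invented for the example rather than taken from the bank's configuration.

```python
_MISSING = object()


def split_namespaces(role_configs):
    """Split full per-role settings into a shared default plus role overrides."""
    roles = list(role_configs)
    first = role_configs[roles[0]]
    # A setting is common only if every role has it with the same value.
    default = {
        key: value
        for key, value in first.items()
        if all(role_configs[r].get(key, _MISSING) == value for r in roles)
    }
    overrides = {
        role: {k: v for k, v in cfg.items() if k not in default}
        for role, cfg in role_configs.items()
    }
    return default, overrides


role_configs = {
    "master": {"cluster.name": "log-es", "node.master": "true", "node.data": "false"},
    "client": {"cluster.name": "log-es", "node.master": "false", "node.data": "false"},
    "hot": {"cluster.name": "log-es", "node.master": "false", "node.data": "true",
            "node.attr.box_type": "hot"},
    "warm": {"cluster.name": "log-es", "node.master": "false", "node.data": "true",
             "node.attr.box_type": "warm"},
}

default, overrides = split_namespaces(role_configs)

# Overlaying a role namespace on the default reproduces the full role config.
effective_hot = {**default, **overrides["hot"]}
```

Because the role namespace is applied last, changing a shared value means one edit in the default namespace, while a role-specific change touches only that role.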

Parameter File Split

Beyond elasticsearch.yml, each node’s jvm.options was also managed via Apollo. Heap settings (-Xms, -Xmx) differ per role, so they were stored in role‑specific namespaces while other JVM options remained in the default namespace.

Elasticsearch Source Code Modification

The team modified InternalSettingsPreparer to load configuration from Apollo after the local elasticsearch.yml is read. The process merges values in the order: role‑specific > default > local file. Special handling was added for array‑type settings (e.g., discovery.zen.ping.unicast.hosts) and for parameters that must not include surrounding quotes.

Log output was enhanced to show old (local) and new (Apollo) values, confirming the override behavior.
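The merge order and the two special cases can be illustrated outside Elasticsearch. The following Python sketch is not the team's InternalSettingsPreparer patch: `merge_settings`, `normalize`, and `ARRAY_KEYS` are hypothetical names, the comma-separated encoding of array values is an assumption, and all hosts and values are invented.

```python
ARRAY_KEYS = {"discovery.zen.ping.unicast.hosts"}  # assumed set of array-typed keys


def strip_quotes(value):
    """Drop one pair of matching surrounding quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


def normalize(key, raw):
    """Convert a remote string value into the form the settings map expects."""
    value = strip_quotes(raw.strip())
    if key in ARRAY_KEYS:
        # Assumed encoding: comma-separated items, optionally wrapped in [ ].
        items = value.strip("[]").split(",")
        return [strip_quotes(item.strip()) for item in items if item.strip()]
    return value


def merge_settings(local, default_ns, role_ns):
    """Apply role-specific > default > local and record each override."""
    remote = {**default_ns, **role_ns}  # the role namespace wins over default
    merged = dict(local)
    changes = []
    for key, raw in remote.items():
        new = normalize(key, raw)
        if merged.get(key) != new:
            changes.append((key, merged.get(key), new))
        merged[key] = new
    return merged, changes


local = {
    "cluster.name": "local-es",
    "path.data": "/data/es",
    "discovery.zen.ping.unicast.hosts": ["127.0.0.1"],
}
default_ns = {
    "cluster.name": '"log-es"',
    "discovery.zen.ping.unicast.hosts": "10.0.0.1:9300, 10.0.0.2:9300",
    "node.data": "true",
}
role_ns = {"node.data": "false"}

merged, changes = merge_settings(local, default_ns, role_ns)
for key, old, new in changes:
    print(f"{key}: {old!r} -> {new!r}")
```

Keys that exist only in the local file (`path.data` here) pass through untouched, and the recorded old/new pairs play the role of the override log described above.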

JVM Configuration Development

Because the Elasticsearch startup script reads jvm.options directly, the team created a separate JAR (Apollo-jvm.jar) that runs before the ES process, replaces any differing keys in jvm.options with values from Apollo, and then exits. The command used is:

java -jar /logger/Apollo/Apollo-jvm.jar -DApollo.cluster=$APOLLO /logger/elasticsearch/config$APOLLO_PATH/jvm.options

Options are stored in Apollo as key‑value properties. A single‑token option such as -Xms2g has no natural key, so it is stored under its prefix (key -Xms, value -Xms2g); during replacement, the key is matched as a prefix of the existing jvm.options line.
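The prefix-matching replacement can be sketched as follows. This is an illustrative Python version, not the contents of Apollo-jvm.jar: `apply_jvm_overrides` is a hypothetical function, and the heap sizes and GC flag are invented values.

```python
def apply_jvm_overrides(lines, overrides):
    """Replace jvm.options lines whose option starts with an override key."""
    result = []
    for line in lines:
        stripped = line.strip()
        replacement = None
        # Comments and blank lines are never candidates for replacement.
        if stripped and not stripped.startswith("#"):
            for prefix, value in overrides.items():
                if stripped.startswith(prefix):
                    replacement = value
                    break
        result.append(line if replacement is None else replacement)
    return result


jvm_options = [
    "# heap size",
    "-Xms1g",
    "-Xmx1g",
    "-XX:+UseConcMarkSweepGC",
]
# Key is the option prefix, value is the full option line to write.
overrides = {"-Xms": "-Xms31g", "-Xmx": "-Xmx31g"}

updated = apply_jvm_overrides(jvm_options, overrides)
```

This sketch only rewrites lines that already exist; an override whose prefix matches nothing is ignored, which may or may not match the behavior of the real tool.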

High Availability & Supervisord Integration

A three‑node Apollo cluster with MySQL master‑slave ensures configuration service continuity. Each ES node caches configuration locally, so loss of Apollo connectivity does not affect operation. Supervisord is used to manage ES processes, redirect stdout to per‑role log files, and provide a unified monitoring UI.
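The local-cache behavior can be pictured as "try the remote source, fall back to the last saved copy". The Python sketch below is a generic illustration of that pattern, not Apollo's client implementation; `load_config`, the JSON cache format, and the file name are assumptions.

```python
import json
import os
import tempfile


def load_config(fetch_remote, cache_path):
    """Return (config, source); fall back to the cached copy if remote fails."""
    try:
        config = fetch_remote()
    except OSError:  # ConnectionError and timeouts are OSError subclasses
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh), "cache"
    # Refresh the cache atomically so a crash never leaves a half-written file.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(config, fh)
    os.replace(tmp_path, cache_path)
    return config, "remote"


def config_center_down():
    raise ConnectionError("config service unreachable")


cache_file = os.path.join(tempfile.mkdtemp(), "es-config.json")
first = load_config(lambda: {"node.data": "true"}, cache_file)
second = load_config(config_center_down, cache_file)
```

With this shape, a node that has started successfully at least once can restart during a configuration-service outage using the last values it saw.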

The following Supervisord program definition for the warm role shows how environment variables inject the role‑specific Apollo settings and where stdout is logged.

[program:elkwarm]
environment=ES_JVM_OPTIONS=%(ENV_ELK_WARM_JVM_OPTIONS)s,APOLLO=%(ENV_APOLLO_WARM)s,APOLLO_PATH=%(ENV_APOLLO_WARM_ONE)s
command=/logger/elasticsearch/bin/elasticsearch -Epath.conf=/logger/elasticsearch/configwarm
user=logger
autostart=true
autorestart=false
stdout_logfile=/loggerfiles/elasticsearch/log/warm/cmbc_elk_warm.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=10
stdout_events_enabled=true
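Supervisor expands `%(ENV_X)s` placeholders using Python string-formatting syntax, with each environment variable exposed under an `ENV_` prefix. The short Python sketch below mimics that expansion for the `environment=` line above; `expand` is a hypothetical helper, and the variable values are assumptions chosen so that `APOLLO_PATH` lines up with the `configwarm` directory used in the command.

```python
def expand(template, environ):
    """Expand %(ENV_NAME)s placeholders with Python %-formatting."""
    mapping = {"ENV_" + name: value for name, value in environ.items()}
    return template % mapping


# Assumed values for the variables present when supervisord starts.
environ = {
    "ELK_WARM_JVM_OPTIONS": "/logger/elasticsearch/configwarm/jvm.options",
    "APOLLO_WARM": "warm",
    "APOLLO_WARM_ONE": "warm",
}

template = (
    "ES_JVM_OPTIONS=%(ENV_ELK_WARM_JVM_OPTIONS)s,"
    "APOLLO=%(ENV_APOLLO_WARM)s,"
    "APOLLO_PATH=%(ENV_APOLLO_WARM_ONE)s"
)

expanded = expand(template, environ)
```

A placeholder that names a variable missing from the environment raises `KeyError` in this sketch, which is a useful reminder to define every referenced variable before the process manager starts.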

Conclusion & Outlook

The project demonstrates that a well‑designed configuration‑management layer (Apollo) can dramatically simplify large‑scale Elasticsearch operations, improve reliability, and enable rapid parameter tuning. Future work includes extending the configuration center to other big‑data components (e.g., Logstash) and promoting the solution as a generic centralized configuration hub for the entire data platform.
