How a Chinese Bank Built a Real‑Time Log Management Platform with Apollo and Elasticsearch
Facing massive log volumes from many systems, China Minsheng Bank's big-data team built a real-time intelligent log platform by integrating Ctrip's open-source Apollo configuration center with Elasticsearch. The result is centralized, versioned, hot-reloading configuration, role-based parameter management, and high-availability deployment across more than a thousand servers.
Background
China Minsheng Bank’s rapid IT expansion created a need for a unified, intelligent log analysis solution capable of handling over 1,000 servers and 10‑20 TB of daily logs across diverse OSes and applications (DB2, Oracle, MySQL, Redis, WebLogic, Kafka, etc.). Existing ES clusters suffered from fragmented configuration files, version chaos, and insufficient monitoring.
Why Apollo?
The team selected Ctrip's open-source Apollo as its configuration-management center because it offers unified, environment-aware configuration, real-time push (hot publishing), version control, gray release, permission governance, client monitoring, native Java and .NET SDKs, and a simple deployment model that requires only Java and MySQL.
Apollo Core Features Used
Unified management of configurations across environments, clusters, and namespaces.
Real‑time configuration push (≈1 s latency) after publishing.
Versioned releases with rollback capability.
Gray‑release to a subset of instances before full rollout.
Fine‑grained permission and audit logging.
Client‑side configuration usage monitoring.
Java and .NET native clients plus HTTP API.
Apollo Architecture in the Log Platform
The platform runs three Apollo services (Config Service, Admin Service, Portal) on three high‑availability nodes, all backed by a single MySQL cluster. Clients (ES nodes) obtain configurations via long‑polling HTTP connections, with a fallback periodic pull every five minutes.
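The long-polling pattern with a periodic fallback pull can be illustrated with a small simulation. This is a minimal sketch, not Apollo's client code: `FakeConfigService`, `ConfigClient`, and their methods are hypothetical stand-ins, and the setting name and values are only examples.

```python
import threading


class FakeConfigService:
    """In-memory stand-in for a configuration server (hypothetical, not Apollo's API)."""

    def __init__(self, config):
        self._config = dict(config)
        self._release = 1
        self._changed = threading.Condition()

    def publish(self, config):
        """Store a new release and wake every waiting long poll."""
        with self._changed:
            self._config = dict(config)
            self._release += 1
            self._changed.notify_all()

    def fetch(self):
        """Return the current release number and a copy of its configuration."""
        with self._changed:
            return self._release, dict(self._config)

    def wait_for_change(self, known_release, timeout):
        """Long poll: block until a newer release exists or the timeout expires."""
        with self._changed:
            self._changed.wait_for(lambda: self._release != known_release, timeout)
            return self._release != known_release


class ConfigClient:
    def __init__(self, service):
        self.service = service
        self.release, self.config = service.fetch()

    def refresh(self):
        """Full pull; a scheduler could also call this as the periodic fallback."""
        self.release, self.config = self.service.fetch()

    def long_poll_once(self, timeout):
        """One long-poll round: pull only when the server reports a change."""
        if self.service.wait_for_change(self.release, timeout):
            self.refresh()
            return True
        return False


# Demo: a publish that happens while the client is waiting is picked up at once.
service = FakeConfigService({"indices.memory.index_buffer_size": "10%"})
client = ConfigClient(service)
threading.Timer(
    0.1, service.publish, args=[{"indices.memory.index_buffer_size": "20%"}]
).start()
client.long_poll_once(timeout=2.0)
print(client.config)
```

The fallback pull exists because a long-poll notification can be missed (for example during a network interruption); calling `refresh()` on a timer bounds how stale a client can become.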
ES Role‑Based Configuration Design
Elasticsearch 5.5 clusters were split into four roles (master, client, hot, warm) with hot nodes using SSDs for fast reads/writes. Configuration files were divided into a default namespace (common parameters) and role‑specific namespaces (hot, warm, master, client). Role namespaces have higher priority, allowing targeted overrides without editing every node’s file.
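The namespace layout can be pictured as one shared dictionary plus a small override per role. The sketch below is illustrative only: the keys are standard Elasticsearch 5.x node settings, but the values, the cluster name, and the `effective_config` helper are assumptions and not the bank's actual parameters.

```python
# Common parameters shared by every node (assumed example values).
DEFAULT_NAMESPACE = {
    "cluster.name": "log-cluster",  # hypothetical cluster name
    "node.master": "false",
    "node.data": "true",
    "node.ingest": "false",
}

# Each role namespace holds only the keys that differ from the default.
ROLE_NAMESPACES = {
    "master": {"node.master": "true", "node.data": "false"},
    "client": {"node.data": "false"},
    "hot": {"node.attr.box_type": "hot"},
    "warm": {"node.attr.box_type": "warm"},
}


def effective_config(role):
    """Start from the default namespace and let the role namespace win on conflicts."""
    merged = dict(DEFAULT_NAMESPACE)
    merged.update(ROLE_NAMESPACES[role])
    return merged


for role in ROLE_NAMESPACES:
    print(role, effective_config(role))
```

With this split, changing a common parameter means editing one entry in the default namespace, while a role-specific tweak touches only that role's namespace.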
Parameter File Split
Beyond elasticsearch.yml, each node's jvm.options was also managed via Apollo. Heap settings (-Xms, -Xmx) differ per role, so they were stored in role-specific namespaces while other JVM options remained in the default namespace.
Elasticsearch Source Code Modification
The team modified InternalSettingsPreparer to load configuration from Apollo after the local elasticsearch.yml is read. Values are then merged with the precedence role-specific > default > local file, so a role-specific value overrides a default one, which in turn overrides the local file. Special handling was added for array-type settings (e.g., discovery.zen.ping.unicast.hosts) and for parameters whose values must not include surrounding quotes.
Log output was enhanced to show old (local) and new (Apollo) values, confirming the override behavior.
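The merge order, the array and quote handling, and the old/new logging described above can be sketched as follows. The actual change lives in Elasticsearch's Java source; this Python version only illustrates the logic, and it assumes that remote values arrive as plain strings and that lists are written in bracketed form. `normalize` and `merge_settings` are hypothetical names.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("apollo-merge")


def normalize(value):
    """Strip one pair of surrounding quotes and turn a bracketed list into a Python list."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    if value.startswith("[") and value.endswith("]"):
        items = [item.strip().strip("\"'") for item in value[1:-1].split(",")]
        return [item for item in items if item]
    return value


def merge_settings(local, default_ns, role_ns):
    """Apply default, then role-specific values on top of the local file, so role wins."""
    merged = dict(local)
    overrides = []
    for source, namespace in (("default", default_ns), ("role", role_ns)):
        for key, raw in namespace.items():
            new = normalize(raw)
            old = merged.get(key)
            if old != new:
                overrides.append((key, old, new, source))
                log.info("%s: %r -> %r (from %s namespace)", key, old, new, source)
            merged[key] = new
    return merged, overrides


local_file = {"discovery.zen.ping.unicast.hosts": ["127.0.0.1"], "path.data": "/data"}
default_namespace = {
    "discovery.zen.ping.unicast.hosts": '["10.0.0.1", "10.0.0.2"]',  # example hosts
    "cluster.name": '"logs"',
}
role_namespace = {"path.data": "/ssd/data"}
settings, changed = merge_settings(local_file, default_namespace, role_namespace)
```

Logging each override with its old and new value makes it easy to confirm at startup which layer supplied a given setting.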
JVM Configuration Development
Because the Elasticsearch startup script reads jvm.options directly, the team created a separate JAR (Apollo-jvm.jar) that runs before the ES process, replaces any entries in jvm.options that differ from the values in Apollo, and then exits. The command used is:
java -jar /logger/Apollo/Apollo-jvm.jar -DApollo.cluster=$APOLLO /logger/elasticsearch/config$APOLLO_PATH/jvm.options
Key-value pairs are stored as properties in Apollo. Single-token options (e.g., -Xms2g) are represented as -Xms:-Xms2g, that is, the key is the option prefix and the value is the full option, and they are matched by prefix during replacement.
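The prefix-based replacement can be sketched in a few lines. This is not the code inside Apollo-jvm.jar, which the article does not show; `apply_overrides` is a hypothetical helper, and it assumes that only existing lines are replaced and that comment lines are left untouched.

```python
def apply_overrides(lines, overrides):
    """Replace each non-comment option that starts with an override prefix.

    `overrides` maps a prefix such as "-Xms" to the full option, e.g. "-Xms2g".
    """
    result = []
    for line in lines:
        stripped = line.strip()
        replacement = line
        if stripped and not stripped.startswith("#"):
            for prefix, option in overrides.items():
                if stripped.startswith(prefix) and stripped != option:
                    replacement = option
                    break
        result.append(replacement)
    return result


jvm_options = ["# heap size", "-Xms1g", "-Xmx1g", "-XX:+UseConcMarkSweepGC"]
role_overrides = {"-Xms": "-Xms2g", "-Xmx": "-Xmx2g"}  # assumed example values
for line in apply_overrides(jvm_options, role_overrides):
    print(line)
```

Matching on the prefix rather than the whole line is what lets `-Xms1g` be recognized as the same option as `-Xms2g`; prefixes should be chosen so that one is not the start of another.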
High Availability & Supervisord Integration
A three‑node Apollo cluster with MySQL master‑slave ensures configuration service continuity. Each ES node caches configuration locally, so loss of Apollo connectivity does not affect operation. Supervisord is used to manage ES processes, redirect stdout to per‑role log files, and provide a unified monitoring UI.
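The local-cache behavior can be illustrated with a short fallback routine. This is a simplified sketch under stated assumptions: `load_config`, the JSON cache format, and the file name are hypothetical and do not reflect how the Apollo client stores its cache.

```python
import json
import tempfile
from pathlib import Path


def load_config(fetch_remote, cache_path):
    """Prefer the remote configuration; fall back to the last cached copy on failure."""
    cache_path = Path(cache_path)
    try:
        config = fetch_remote()
    except OSError:
        if not cache_path.exists():
            raise  # nothing cached yet, so the failure cannot be masked
        return json.loads(cache_path.read_text(encoding="utf-8")), "cache"
    cache_path.write_text(json.dumps(config), encoding="utf-8")
    return config, "remote"


def remote_ok():
    return {"node.attr.box_type": "warm"}


def remote_down():
    raise ConnectionError("configuration service unreachable")


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "es-config-cache.json"
    print(load_config(remote_ok, cache_file))    # served remotely, cache refreshed
    print(load_config(remote_down, cache_file))  # served from the cached copy
```

A node that has started successfully at least once can therefore restart with its last known configuration even while the configuration service is unreachable.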
The Supervisord snippet below, for a warm node, shows how environment variables inject the role-specific Apollo settings and where the process's stdout is written.
[program:elkwarm]
environment=ES_JVM_OPTIONS=%(ENV_ELK_WARM_JVM_OPTIONS)s,APOLLO=%(ENV_APOLLO_WARM)s,APOLLO_PATH=%(ENV_APOLLO_WARM_ONE)s
command=/logger/elasticsearch/bin/elasticsearch -Epath.conf=/logger/elasticsearch/configwarm
user=logger
autostart=true
autorestart=false
stdout_logfile=/loggerfiles/elasticsearch/log/warm/cmbc_elk_warm.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=10
stdout_events_enabled=true
Conclusion & Outlook
The project demonstrates that a well‑designed configuration‑management layer (Apollo) can dramatically simplify large‑scale Elasticsearch operations, improve reliability, and enable rapid parameter tuning. Future work includes extending the configuration center to other big‑data components (e.g., Logstash) and promoting the solution as a generic centralized configuration hub for the entire data platform.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
