Design and Implementation of JD Daojia Security Operations Center (SOC) Platform
This article details the challenges, design choices, deployment steps, detection model creation, data processing, visualization, and future plans of JD Daojia's security operations platform, highlighting the use of Graylog, Elasticsearch, and MongoDB to achieve scalable, real‑time threat detection and response.
1. Introduction
Security operations focus on assets and take security event management as the core process. A security operations platform builds real‑time asset risk models and performs event analysis, risk analysis, warning management, and emergency response, thereby ensuring the safe operation of enterprise systems and services.
JD Daojia faces massive volumes of malicious network attacks, and disparate alerts from individual security devices lead to missed detections and delayed responses. Building a SOC aims to collect and correlate logs from different business systems and security devices, improve detection efficiency, and achieve closed‑loop network security management.
2. Challenges
The platform must align with JD Daojia's business direction and existing resources, demanding high scalability. Log data sources must satisfy attack‑chain detection needs, and existing security devices must support data aggregation and correlation.
- Ensuring data sources meet attack‑chain detection requirements.
- Making the platform's security detection capabilities meet business requirements.
- Providing collaborative analysis capabilities.
3. Security Operations Platform Overview
The platform uses business logs as data sources to detect attacks and must handle billions of log entries. An open‑source log analysis platform was selected for further development.
3.1 Open‑Source Log Analysis Platform Selection
Three mainstream platforms—ELK, Loki, and Graylog—were compared.
| Platform | Analysis |
| --- | --- |
| ELK | Open‑source suite (Elasticsearch, Logstash, Kibana) supporting multi‑source log collection, distributed search, and a visual UI. |
| Loki | Easy to operate; does not fully index log content, instead using Prometheus‑style labels; well suited to Kubernetes pod logs. |
| Graylog | Integrated deployment; supports multi‑source log collection, field modification, TB‑scale queries, archiving, superior alerting, and a Python client library. |
Graylog was chosen because it offers better alerting and meets the platform’s requirements.
3.2 Platform Design
Based on Graylog, the platform consists of four modules: data source, data storage, detection & analysis, and visualization.
3.2.1 Data Source Module
Collects logs from infrastructure (web access, host logs) and security devices (WAF, HIDS, firewall alerts).
3.2.2 Data Storage Module
Uses Elasticsearch to store collected logs and MongoDB to store Graylog operation logs.
3.2.3 Detection & Analysis Module
Core module containing rule engine, analysis engine, and alarm engine.
- Rule Engine: matches logs against defined security detection rules.
- Analysis Engine: performs secondary analysis on matched data and stores the results.
- Alarm Engine: provides alert notification capabilities within the platform.
3.2.4 Visualization Module
Displays abnormal scenarios identified by the detection module, including alarm data flow, dashboards, and threat posture views.
4. Platform Construction
Construction includes Graylog deployment, log ingestion, detection model generation, alarm data processing, and visualization.
4.1 Graylog Deployment
Graylog forms the core infrastructure, deployed alongside Elasticsearch and MongoDB.
4.2 Log Ingestion
Infrastructure logs (e.g., web access) are forwarded via Graylog Agent to Elasticsearch. Security device logs (WAF, HIDS, firewall) are collected via device‑specific interfaces and normalized.
Unified log ingestion enables complete attack‑chain reconstruction for threat tracing.
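To make the ingestion step concrete, here is a minimal sketch of shipping a normalized log entry to a Graylog input using the GELF format (Graylog's native JSON log format, typically received on UDP port 12201). The hostname, port, and field names are illustrative assumptions, not JD Daojia's actual configuration:

```python
import json
import socket
import zlib

def build_gelf(short_message: str, source_host: str, **fields) -> dict:
    """Build a GELF 1.1 payload. GELF requires version, host, and
    short_message; custom fields must be prefixed with an underscore."""
    msg = {"version": "1.1", "host": source_host, "short_message": short_message}
    msg.update({f"_{k}": v for k, v in fields.items()})
    return msg

def send_gelf(message: dict, host: str = "graylog.example.internal", port: int = 12201) -> None:
    """Compress the payload and send it to a Graylog GELF UDP input.
    Host and port are placeholders for a real deployment."""
    payload = zlib.compress(json.dumps(message).encode("utf-8"))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

A web‑access entry could then be shipped as `send_gelf(build_gelf("GET /login 200", "web-01", src_ip="1.2.3.4", status=200))`, with the custom `_src_ip` and `_status` fields available for rule matching downstream.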
4.3 Detection Model Generation
Detection models consist of rules for various attack behaviors. Graylog’s rule engine processes massive log streams, extracting URLs, headers, bodies, status codes, etc., to reduce false positives.
Different data sources require tailored rules to avoid duplicate detection, false alarms, or missed alerts.
Web logs (NGINX) are analyzed for both network attacks and malicious business behavior (e.g., fraudulent orders). Distinguishing logged‑in vs. non‑logged‑in attacks helps prioritize response.
Security device logs (cloud WAF, local devices) are aggregated, filtered, and correlated with web logs to reduce false positives and enable joint analysis.
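The rule‑matching idea described above can be sketched in Python. The rule names, regex patterns, and field names below are illustrative examples, not JD Daojia's actual detection rules; note how the presence of a user ID distinguishes logged‑in attacks for prioritization:

```python
import re
from typing import Optional

# Illustrative detection rules; a production rule set would be far larger
# and tuned per data source to avoid duplicates and false positives.
RULES = [
    ("sql_injection", re.compile(r"(union\s+select|or\s+1=1|sleep\()", re.I)),
    ("path_traversal", re.compile(r"\.\./")),
    ("webshell_upload", re.compile(r"\.(php|jsp)\b", re.I)),
]

def match_rules(log: dict) -> Optional[dict]:
    """Return an alert dict when any rule matches the request URL or body,
    or None for benign traffic."""
    haystack = f"{log.get('url', '')} {log.get('body', '')}"
    for name, pattern in RULES:
        if pattern.search(haystack):
            return {
                "rule": name,
                "src_ip": log.get("src_ip"),
                "url": log.get("url"),
                # Logged-in attacks are flagged so they can be prioritized.
                "logged_in": bool(log.get("user_id")),
            }
    return None
```

Running `match_rules` over the normalized NGINX log stream yields the labeled alerts that the analysis engine later enriches.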
4.4 Data Processing
After rule‑engine detection, alerts are enriched and stored by the analysis engine.
Enriched alerts are pushed via Enterprise WeChat for immediate analyst response.
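The push step can be sketched against the Enterprise WeChat group‑bot webhook API, which accepts a JSON body with a `msgtype` and message content. The webhook key, alert fields, and message layout below are assumptions for illustration:

```python
import json
import urllib.request

# Placeholder webhook; a real deployment substitutes its own bot key.
WEBHOOK = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR_KEY"

def format_alert(alert: dict) -> dict:
    """Render an enriched alert as an Enterprise WeChat markdown message."""
    content = (
        f"**Security Alert: {alert['rule']}**\n"
        f"> source IP: {alert['src_ip']}\n"
        f"> URL: {alert['url']}\n"
        f"> logged in: {alert.get('logged_in', False)}"
    )
    return {"msgtype": "markdown", "markdown": {"content": content}}

def push_alert(alert: dict) -> None:
    """POST the formatted alert to the webhook (network call)."""
    data = json.dumps(format_alert(alert)).encode("utf-8")
    req = urllib.request.Request(
        WEBHOOK, data=data, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)
```

Separating formatting from sending keeps the message template testable without hitting the webhook.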
4.5 Visualization
Processed data is displayed through dashboards, showing web and device alerts, enabling rapid attack chain visualization.
Each alert can be queried by IP, device fingerprint, or time to trace the full attack chain.
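A trace query of this kind maps naturally onto an Elasticsearch bool query. The field names (`src_ip`, `timestamp`) assume the normalized schema described earlier and are illustrative:

```python
def trace_query(src_ip: str, start: str, end: str) -> dict:
    """Build an Elasticsearch query body that pulls every event for one
    source IP within a time window, oldest first, so the full attack
    chain can be read in order."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"src_ip": src_ip}},
                    {"range": {"timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
        "sort": [{"timestamp": "asc"}],
    }
```

Swapping the `term` filter to a device‑fingerprint field gives the fingerprint‑based trace with the same structure.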
4.6 Detection Workflow
The overall workflow includes Graylog deployment, log forwarding, model detection, data enrichment, and visualization, as illustrated below.
Logs are normalized, stored, matched against attack rules, labeled, visualized, and sent as WeChat work orders for incident handling.
5. Achievements and Future Plans
5.1 Achievements
The platform enables threat management, event correlation, and work‑order management, providing comprehensive security capabilities.
- Threat management: centralized view of all threat alerts.
- Event correlation: joint analysis of business‑system and device alerts for attack tracing.
- Work‑order management: automatic WeChat notifications and a consolidated view of all alerts.
5.2 Future Plans
Future work includes automated asset discovery and integration, and automated attack‑chain identification using correlated logs, timestamps, IPs, and fingerprints.
- Asset linkage automation: script‑driven onboarding of new network assets.
- Automated attack‑chain detection: combine platform alerts with host logs to auto‑generate full attack chains.
6. Conclusion
The security operations system gives JD Daojia clear insight into its security posture, supports stable business operation, and will continue to evolve to meet emerging threats through iterative optimization.
Dada Group Technology
Sharing insights and experiences from Dada Group's R&D department on product refinement and technology advancement, connecting with fellow geeks to exchange ideas and grow together.