
Business Monitoring Solutions and Log Practices for KA Merchants

This article details the background, design, implementation, and best‑practice guidelines for business‑level monitoring, unified logging formats, log4j configurations, alert rules, and case studies of common issues faced by KA merchants in logistics operations.

JD Tech Talk

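Background – In routine operations, an anomaly in system-level metrics usually shows up in business-level metrics as well, but the reverse does not hold: business-level metrics can go wrong while system-level metrics still look healthy. Relying on system monitoring alone therefore delays the detection of business issues and risks large-scale impact.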

Business Monitoring Scheme – A generic data‑monitoring flow collects instrumented data, aggregates metrics, sets threshold‑based alerts, and visualizes them on dashboards. Three internal DevOps platforms (UMP, PFinder, Taishan) are used for KA merchant monitoring.

UMP Monitoring – The earliest solution. It has since been retired, but monitoring for some applications still runs on it.

PFinder Monitoring – Tracks package count thresholds in fast‑delivery order flow, triggering alerts when limits are exceeded.

Taishan Monitoring – The most widely used platform, covering unified log format, coding practices, data visualization, alerting, and best practices.

Unified Log Format

The log schema includes fields such as business domain, sub‑domain, scenario, channel source, merchant code, density, result (Y/N), result code/description, sub‑code/description, merchant order number, internal order number, and waybill number.

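|Business domain|Business sub-domain|Business scenario|Channel source|Merchant code|Density|Result (Y/N)|Result code|Result code description|Result sub-code|Result sub-code description|Merchant order number|Order number|Waybill number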

Examples of successful and failed logs are provided.
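As a rough illustration of how the schema above maps onto a single log message, the following sketch splits one pipe-delimited message into named fields. The class name, the field keys, and every value in the sample line are hypothetical and chosen only for illustration; they are not taken from a real log.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: split one pipe-delimited business log message into named fields. */
final class BusinessLogLine {
    // Hypothetical field keys, listed in the same order as the schema.
    static final String[] FIELDS = {
        "businessDomain", "subDomain", "scenario", "channelSource", "merchantCode",
        "density", "result", "resultCode", "resultDesc", "subCode", "subDesc",
        "merchantOrderNo", "orderNo", "waybillNo"
    };

    static Map<String, String> parse(String message) {
        // The message starts with '|', so drop it before splitting.
        String body = message.startsWith("|") ? message.substring(1) : message;
        // A limit of -1 keeps trailing empty fields, e.g. a waybill number not yet assigned.
        String[] parts = body.split("\\|", -1);
        if (parts.length != FIELDS.length) {
            throw new IllegalArgumentException(
                "expected " + FIELDS.length + " fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            fields.put(FIELDS[i], parts[i]);
        }
        return fields;
    }

    static boolean isSuccess(Map<String, String> fields) {
        return "Y".equals(fields.get("result"));
    }

    public static void main(String[] args) {
        // Made-up sample of a failed order; the waybill number is empty.
        String sample = "|order|sales-outbound|create|CH01|M001|1|N|500|"
            + "stock check failed|5001|sku not found|MO-1|SO-1|";
        Map<String, String> fields = parse(sample);
        System.out.println(fields);
        System.out.println("success=" + isSuccess(fields));
    }
}
```

Rejecting lines with an unexpected field count is a deliberate choice here: a silently shifted column would corrupt every downstream metric that reads the result or merchant fields.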

Coding Practices

log4j Configuration

<property name="patternLayout">%d{yyyy-MM-dd HH:mm:ss.SSS}-%X{PFTID}-%-5p - [%t] %c -%m%n</property>
<RollingRandomAccessFile name="businessFile" fileName="${log_path}/eclp-biz-eclp-isv-business.log" filePattern="${log_path}/eclp-biz-eclp-isv-business-%i.log">
    <PatternLayout charset="UTF-8" pattern="${patternLayout}"/>
    <Policies>
        <SizeBasedTriggeringPolicy size="1GB"/>
    </Policies>
    <DefaultRolloverStrategy max="5"/>
</RollingRandomAccessFile>

AsyncLogger references the appender:

<AsyncLogger name="BusinessLogger" level="INFO" additivity="false" includeLocation="false">
    <AppenderRef ref="businessFile"/>
</AsyncLogger>

Business logger definition in code:

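import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Business logger. The name must match the AsyncLogger name
 * ("BusinessLogger") declared in the log4j configuration.
 */
private static final Logger blogger = LoggerFactory.getLogger("BusinessLogger");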

Logging a business event:

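// Fixed prefix fields: 订单域 (order domain) | 销售出 (sales outbound) | 下单 (order placement).
// The 12 arguments below fill the remaining placeholders in order.
blogger.info("|订单域|销售出|下单|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}|{}",
        order.getSourceChannel(), order.getShopNo(), order.getDepartmentNo(), 1,
        result, code, message, subCode, subMessage,
        order.getIsvUUID(), context.getPin(), soNo);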
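Because the format is pipe-delimited, a free-text field such as a result description that happens to contain `|` or a line break would shift or split the columns. One possible safeguard, shown here as a minimal sketch and not as part of the original implementation, is a small helper that sanitizes each field before joining. The class name `BusinessLogFormatter` and the choice of replacement characters are assumptions.

```java
/** Sketch: build a pipe-delimited business log message with sanitized fields. */
final class BusinessLogFormatter {

    /** Null becomes an empty field; delimiters and line breaks are replaced. */
    static String clean(Object value) {
        if (value == null) {
            return "";
        }
        // Assumption: '/' and ' ' are acceptable substitutes in descriptions.
        return String.valueOf(value)
            .replace('|', '/')
            .replaceAll("[\\r\\n]+", " ");
    }

    /** Joins the fields as |f1|f2|...|fn, matching the leading-pipe layout. */
    static String format(Object... fields) {
        StringBuilder sb = new StringBuilder();
        for (Object field : fields) {
            sb.append('|').append(clean(field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        System.out.println(format("order", "create", null, "timeout | retry", "line1\nline2"));
    }
}
```

With such a helper, the call site could pass the finished string to the logger, for example `blogger.info(BusinessLogFormatter.format(...))`, so every event stays on one parseable line.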

Data Visualization

Taishan dashboards display minute‑level metrics per department, including success rate, failure count, order volume tables, and result breakdowns.
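The minute-level, per-department metrics described above amount to bucketing events by department and minute, then deriving counts and a success rate per bucket. The sketch below shows that aggregation in plain Java; it is not how Taishan computes its dashboards, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: per-department, per-minute counters with a derived success rate. */
final class MinuteStats {

    static final class Bucket {
        long total;
        long failed;

        double successRate() {
            return total == 0 ? 0.0 : (double) (total - failed) / total;
        }
    }

    private final Map<String, Bucket> buckets = new HashMap<>();

    // Truncating epoch milliseconds to whole minutes yields the bucket index.
    private static String key(String department, long epochMillis) {
        return department + "@" + (epochMillis / 60_000L);
    }

    /** Records one business event; success corresponds to result == "Y". */
    void record(String department, long epochMillis, boolean success) {
        Bucket bucket = buckets.computeIfAbsent(key(department, epochMillis), k -> new Bucket());
        bucket.total++;
        if (!success) {
            bucket.failed++;
        }
    }

    /** Returns the bucket for that department and minute, or null if no events were recorded. */
    Bucket get(String department, long epochMillis) {
        return buckets.get(key(department, epochMillis));
    }

    public static void main(String[] args) {
        MinuteStats stats = new MinuteStats();
        long now = System.currentTimeMillis();
        stats.record("D001", now, true);
        stats.record("D001", now, false);
        Bucket bucket = stats.get("D001", now);
        System.out.println("total=" + bucket.total + " failed=" + bucket.failed
            + " rate=" + bucket.successRate());
    }
}
```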

Alert Rules

Success-rate alerts trigger when the success rate stays below 50% for several consecutive periods. Volume surge and dip alerts compare the current volume day over day and week over week to avoid false positives, and apply an absolute volume threshold to filter out low-traffic noise.
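The two rule types can be sketched as plain predicates. This is an illustrative reading of the rules, not the platform's actual rule engine: the number of consecutive periods, the change ratio, and the volume floor are parameters the caller must choose, and requiring both the day-over-day and the week-over-week change to exceed the limit is an assumption about how the two comparisons are combined.

```java
/** Sketch: success-rate and volume-change alert predicates. */
final class AlertRules {

    /** True when the most recent `periods` success rates are all below `threshold`. */
    static boolean successRateAlert(double[] rates, int periods, double threshold) {
        if (periods <= 0 || rates.length < periods) {
            return false;
        }
        for (int i = rates.length - periods; i < rates.length; i++) {
            if (rates[i] >= threshold) {
                return false;
            }
        }
        return true;
    }

    /** Relative change of current against a baseline, as a non-negative ratio. */
    static double relativeChange(long current, long baseline) {
        if (baseline == 0) {
            return current == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return Math.abs(current - baseline) / (double) baseline;
    }

    /**
     * True when volume deviates from both the same period yesterday and the
     * same period last week by more than maxChange (assumption: both must agree).
     * Periods where all three volumes are below minVolume are ignored as noise.
     */
    static boolean volumeAlert(long current, long yesterday, long lastWeek,
                               double maxChange, long minVolume) {
        long peak = Math.max(current, Math.max(yesterday, lastWeek));
        if (peak < minVolume) {
            return false;
        }
        return relativeChange(current, yesterday) > maxChange
            && relativeChange(current, lastWeek) > maxChange;
    }

    public static void main(String[] args) {
        // Hypothetical inputs for illustration only.
        double[] rates = {0.9, 0.4, 0.3, 0.2};
        System.out.println(successRateAlert(rates, 3, 0.5));
        System.out.println(volumeAlert(10, 100, 100, 0.5, 50));
    }
}
```

Checking against two baselines means a one-off spike yesterday or a holiday last week is less likely to trigger an alert on its own, and the volume floor keeps merchants with a handful of orders per period from producing large percentage swings.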

Best Practices (Case Studies)

Merchant warehouse migration causing low success rates until SKU switch completed.

Duplicate order submissions due to FTP‑based batch processing.

Stock shortage from 2B/2C warehouse transfer delays.

Department switch leading to temporary zero success rate.

Product level adjustments causing inventory mismatches.

External API (Tencent Map) timeouts affecting O2O shipments.

OAID verification failures after recipient info changes.

Collection‑time validation failures for re‑pushed orders.

Common input‑parameter errors requiring merchant verification.

Upstream traffic anomalies causing sudden volume drops or spikes.

Conclusion

Extensive monitoring has been built for most KA merchants, enabling rapid detection and resolution of issues, improving system availability and merchant experience. Monitoring is a means to the ultimate goal of higher service reliability.

Written by

JD Tech Talk

Official JD Tech public account delivering best practices and technology innovation.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.