
How Suning Scaled Its API Platform: Standards, High Availability, and O2O Event Readiness

This article explains how Suning built a standardized, high‑availability API gateway, detailing naming conventions, documentation practices, protocol choices, error‑code design, dynamic configuration, SDK automation, system refactoring, monitoring, intelligent alerting, and the specific preparations made for the O2O shopping festival.

Suning Technology

Introduction

In 2012, amid the Open Cloud era, Suning opened its APIs to merchants, suppliers, logistics, and software partners, enabling seamless integration and efficient business processing.

1. All Based on Standards

Suning learned the importance of consistent API naming, documentation, and error handling. A standard API name follows the pattern suning.business.module.operation: it is written in lowercase English, has a fixed four‑segment format, and is concise and extensible. Many of the early API names did not meet this standard, for example:

suning.order.get
suning.selfmarket.order1.query
suning.custom.book.item.query
suning.retuenBadArticleHandleResults.add
suning.shoppingmallsalesdata.saveandupdate

These original names show the problems: the number of segments varies from three to five, and the styles are inconsistent (camelCase, embedded digits, and long run‑together words).
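As an illustration, the naming rule can be checked mechanically. The sketch below is one possible reading of the convention (four dot‑separated segments, the first fixed to "suning", the rest lowercase letters only); the regular expression and the function name are assumptions for illustration, not Suning's actual validation logic.

```python
import re

# Assumed reading of the convention: exactly four dot-separated segments,
# the first fixed to "suning", the others lowercase ASCII letters only.
API_NAME_PATTERN = re.compile(r"suning(\.[a-z]+){3}")


def is_standard_api_name(name: str) -> bool:
    """Return True if the name matches the four-segment lowercase pattern."""
    return API_NAME_PATTERN.fullmatch(name) is not None


# The original names quoted in the article.
legacy_names = [
    "suning.order.get",
    "suning.selfmarket.order1.query",
    "suning.custom.book.item.query",
    "suning.retuenBadArticleHandleResults.add",
    "suning.shoppingmallsalesdata.saveandupdate",
]

for name in legacy_names:
    print(name, is_standard_api_name(name))
```

Under this reading, every one of the original names is rejected, while a hypothetical name such as suning.custom.order.query passes.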

2. Documentation Standardization

A well‑structured API document helps users understand purpose and usage. The document typically includes overview, request/response schemas, error codes, and examples.

API documentation components

3. Mainstream Protocols

Suning APIs use HTTPS with OAuth 2.0, support XML or JSON payloads, and come with SDKs for Java, .NET, PHP, and Python. The global entry point is https://open.suning.com/api/http/sopRequest.

4. Fixed Service Contracts

A fixed service contract between the API gateway and the internal systems is a prerequisite for standardization, and it makes rapid API publishing possible.
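To make the idea concrete, here is a minimal sketch of what a fixed contract could look like: every internal service receives the same request envelope and returns the same response envelope, so publishing a new API only means registering one more handler. The class names, field names, and the "api.not.found" code are hypothetical; the article does not describe the actual contract.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class ServiceRequest:
    """Hypothetical fixed request envelope sent by the gateway."""
    api_name: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ServiceResponse:
    """Hypothetical fixed response envelope returned by every service."""
    success: bool
    data: Optional[Dict[str, Any]] = None
    error_code: Optional[str] = None


Handler = Callable[[ServiceRequest], ServiceResponse]
_registry: Dict[str, Handler] = {}


def publish(api_name: str, handler: Handler) -> None:
    """Register a handler; the contract itself never changes."""
    _registry[api_name] = handler


def dispatch(request: ServiceRequest) -> ServiceResponse:
    """Route a request to its handler using only the shared envelope."""
    handler = _registry.get(request.api_name)
    if handler is None:
        return ServiceResponse(success=False, error_code="api.not.found")
    return handler(request)
```

Because the gateway depends only on the two envelope types, it needs no per‑API code when a new service is added.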

5. Error‑Code Standardization

Each error code is unique, consistently styled, hierarchical, and extensible, allowing precise monitoring and alerting.

Uniqueness

Consistent style

Layered classification

Extensibility
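The four properties can be illustrated with a small sketch. It assumes a hypothetical three‑level code format, layer.module.reason (for example sys.gateway.timeout); the article does not give the real format, so the codes and names below are placeholders.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class ErrorCode(NamedTuple):
    layer: str    # e.g. "sys" for platform errors, "biz" for business errors
    module: str
    reason: str


def parse_error_code(code: str) -> ErrorCode:
    """Split a code in the assumed layer.module.reason format."""
    parts = code.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"malformed error code: {code!r}")
    return ErrorCode(*parts)


class ErrorCodeRegistry:
    """Enforces a consistent style and uniqueness at registration time."""

    def __init__(self) -> None:
        self._codes: Dict[str, str] = {}

    def register(self, code: str, message: str) -> None:
        parse_error_code(code)  # rejects codes that break the style
        if code in self._codes:
            raise ValueError(f"duplicate error code: {code!r}")
        self._codes[code] = message


def count_by_layer(codes: Iterable[str]) -> Counter:
    """Aggregate observed codes by top-level layer, e.g. for monitoring."""
    return Counter(parse_error_code(c).layer for c in codes)
```

New modules and reasons can be added without touching existing codes, and the layered form lets monitoring aggregate at whichever level an alert needs.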

6. API Configuration

APIs are configured via a UI that supports basic info, external request mapping, and internal service routing.

Basic information configuration
External request API
Internal API request
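The three configuration areas can be pictured as one record per API. The sketch below is a simplified, hypothetical model (all field names, the service name, and the parameter names are assumptions) showing how an external request could be translated into an internal call purely from configuration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ApiConfig:
    # Basic information
    name: str
    description: str
    # External request: parameters the caller must send
    external_params: Tuple[str, ...]
    # Internal request: target service and external-to-internal name mapping
    internal_service: str
    param_mapping: Dict[str, str]


def to_internal_call(config: ApiConfig,
                     request_params: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Translate an external request into (service, arguments)."""
    missing = [p for p in config.external_params if p not in request_params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    args = {
        config.param_mapping.get(p, p): request_params[p]
        for p in config.external_params
    }
    return config.internal_service, args
```

With this shape, publishing or changing an API is a data change rather than a code change, which is what makes UI‑driven configuration workable.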

7. SDK Automation

After an API is published, SDKs for Java, .NET, PHP, and Python are generated automatically, tested, and uploaded for external use.
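One common way to automate SDK generation is template‑based code generation from the API definition. The following sketch generates a small Python request class from an API name and a parameter list; the template, the class‑naming rule, and the parameter names are illustrative assumptions, not Suning's generator.

```python
from string import Template
from typing import List

_CLASS_TEMPLATE = Template('''\
class ${class_name}:
    """Request object for ${api_name} (generated)."""

    API_NAME = "${api_name}"

    def __init__(self, ${signature}):
${assignments}

    def to_params(self):
        return {${param_items}}
''')


def class_name_for(api_name: str) -> str:
    """Map e.g. 'suning.custom.order.query' to 'CustomOrderQueryRequest'."""
    segments = api_name.split(".")[1:]
    return "".join(s.capitalize() for s in segments) + "Request"


def generate_request_class(api_name: str, params: List[str]) -> str:
    """Render Python source for one request class."""
    assignments = "\n".join(f"        self.{p} = {p}" for p in params)
    return _CLASS_TEMPLATE.substitute(
        class_name=class_name_for(api_name),
        api_name=api_name,
        signature=", ".join(params),
        assignments=assignments or "        pass",
        param_items=", ".join(f'"{p}": self.{p}' for p in params),
    )


print(generate_request_class("suning.custom.order.query", ["order_code"]))
```

A real pipeline would keep one template per target language and run generated code through automated tests before upload, as the article describes.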

System Refactoring Details

As the number of APIs grew, performance bottlenecks became apparent. The initial architecture used IBM IHS, WebSphere, and DB2, and each API accessed the business databases directly, which led to slow response times and limited concurrency.

Legacy architecture

The refactoring introduced Nginx + WildFly, flow control, monitoring, and Hessian‑based service calls, with logs sent to Kafka and indexed by Elasticsearch.

New architecture

Further improvements added interface sharding, Netty‑based API services, asynchronous RSF calls, ZooKeeper‑based configuration, and multi‑group worker pools to isolate workloads.

Core component refactor
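The workload isolation provided by multi‑group worker pools can be sketched with standard thread pools: each group of APIs gets its own bounded pool, so a slow group can exhaust only its own workers. The group names, pool sizes, and class below are hypothetical; the production system described here is built on Netty and RSF, not on this code.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class GroupedWorkerPools:
    """Routes each API to a dedicated pool so groups cannot starve each other."""

    def __init__(self, group_sizes: Dict[str, int], default_group: str) -> None:
        if default_group not in group_sizes:
            raise ValueError("default group must be one of the configured groups")
        self._pools = {
            group: ThreadPoolExecutor(max_workers=size, thread_name_prefix=group)
            for group, size in group_sizes.items()
        }
        self._default = default_group
        self._api_groups: Dict[str, str] = {}

    def assign(self, api_name: str, group: str) -> None:
        if group not in self._pools:
            raise KeyError(group)
        self._api_groups[api_name] = group

    def group_for(self, api_name: str) -> str:
        return self._api_groups.get(api_name, self._default)

    def submit(self, api_name: str, fn: Callable[..., Any], *args: Any) -> Future:
        """Run fn on the pool that belongs to the API's group."""
        return self._pools[self.group_for(api_name)].submit(fn, *args)

    def shutdown(self) -> None:
        for pool in self._pools.values():
            pool.shutdown(wait=True)
```

If the calls in one group block on a slow backend, only that group's workers are tied up; requests for the other groups keep flowing.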

High‑Availability Design

The platform employs CDN, hardware firewalls, load balancers, WAF (Nginx+Lua), and layered flow control to ensure stability.

Request chain
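Layered flow control can be sketched as a chain of rate limiters in which a request must pass every layer. The example below uses token buckets, a common choice, with one global bucket and optional per‑API buckets; the article does not say which algorithm Suning uses, and the rates are placeholders. The sketch is single‑threaded for clarity; a real gateway would need locking or a shared counter store.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate            # tokens added per second
        self._capacity = capacity    # maximum burst size
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def has_token(self) -> bool:
        """Refill based on elapsed time, then report whether a token exists."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        return self._tokens >= 1

    def take(self) -> None:
        self._tokens -= 1


class LayeredLimiter:
    """A request is admitted only if every layer has a token available."""

    def __init__(self, global_bucket: TokenBucket,
                 per_api: Dict[str, TokenBucket]) -> None:
        self._global = global_bucket
        self._per_api = per_api

    def allow(self, api_name: str) -> bool:
        buckets = [self._global]
        api_bucket = self._per_api.get(api_name)
        if api_bucket is not None:
            buckets.append(api_bucket)
        if not all(b.has_token() for b in buckets):
            return False  # rejected requests consume no tokens
        for b in buckets:
            b.take()
        return True
```

Checking all layers before consuming any token means a request rejected by the per‑API layer does not eat into the global budget.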

Monitoring Practices

Log collection moved from Flume/Hadoop to real‑time Kafka → Elasticsearch pipelines, enabling near‑real‑time query and alerting. Interface statistics are computed offline with Hadoop/Hive and online with the Libra real‑time platform (based on Storm/EPL).

Libra real‑time platform

Intelligent Alerting

Suning integrates Zabbix, the MuJia platform, and the CloudTrace anomaly system, providing SMS, email, and instant‑messenger notifications. API‑level alerts trigger on latency, call volume, failure rate, and error‑code spikes.
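The four API‑level triggers can be expressed as simple threshold rules over per‑window statistics. The sketch below is illustrative only: the statistics record, the threshold values, and the alert labels are assumptions, and it does not use the APIs of Zabbix or of Suning's internal platforms.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ApiWindowStats:
    """Aggregated metrics for one API over one time window."""
    api_name: str
    calls: int
    failures: int
    avg_latency_ms: float
    error_code_counts: Dict[str, int] = field(default_factory=dict)


@dataclass
class AlertThresholds:
    # Placeholder limits; real values would be tuned per API.
    max_avg_latency_ms: float = 500.0
    max_calls: int = 10_000
    max_failure_rate: float = 0.05
    max_per_error_code: int = 100


def evaluate(stats: ApiWindowStats, limits: AlertThresholds) -> List[str]:
    """Return the names of all rules the window violates."""
    alerts: List[str] = []
    if stats.avg_latency_ms > limits.max_avg_latency_ms:
        alerts.append("latency")
    if stats.calls > limits.max_calls:
        alerts.append("call_volume")
    if stats.calls > 0 and stats.failures / stats.calls > limits.max_failure_rate:
        alerts.append("failure_rate")
    for code, count in sorted(stats.error_code_counts.items()):
        if count > limits.max_per_error_code:
            alerts.append(f"error_code:{code}")
    return alerts
```

The returned labels would then be handed to whichever notification channel (SMS, email, or instant messenger) is configured for the API.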

O2O Shopping Festival Support

For the annual O2O shopping festival, Suning prepared capacity assessments, performance tests, and emergency pre‑plans (service degradation, feature toggles). During the event, 24‑hour on‑call teams monitored system health and executed the pre‑plans as needed; after the event, the teams carried out analysis and optimization.
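A degradation pre‑plan of this kind often comes down to a switch that operators can flip at runtime. The sketch below shows the basic mechanism with an in‑memory switch table; the class, feature names, and fallback values are hypothetical, and in practice the switch state would more likely be pushed from a configuration center than held in process memory.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class DegradationSwitches:
    """Runtime switches that divert calls to a cheaper fallback."""

    def __init__(self) -> None:
        self._degraded: Dict[str, bool] = {}

    def degrade(self, feature: str) -> None:
        self._degraded[feature] = True

    def restore(self, feature: str) -> None:
        self._degraded[feature] = False

    def is_degraded(self, feature: str) -> bool:
        return self._degraded.get(feature, False)

    def call(self, feature: str,
             primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        """Run the primary path unless the feature has been degraded."""
        if self.is_degraded(feature):
            return fallback()
        return primary()


switches = DegradationSwitches()
switches.degrade("recommendations")
result = switches.call(
    "recommendations",
    primary=lambda: ["personalized-item"],  # stands in for an expensive call
    fallback=lambda: [],                    # degraded: return an empty list
)
print(result)
```

Preparing the fallback path and rehearsing the switch before the event is what lets an on‑call team execute the plan quickly under load.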

Monitoring dashboard