
How Nanguo Film Migrated 30+ Services to Alibaba Cloud Serverless in Just 7 Days

In a seven‑day sprint, Nanguo Film moved more than 30 systems behind its streaming platform from ECS to Alibaba Cloud's Serverless Application Engine (SAE). The team reports roughly 70% higher development efficiency, cost savings of more than 40%, and scaling more than ten times faster, with zero downtime during the migration.

Alibaba Cloud Native

Nov 9, 2021

Pain Points

The original architecture ran entirely on Alibaba Cloud ECS instances. Operational bottlenecks included:

Slow elastic scaling – during traffic spikes, new ECS instances had to be purchased and provisioned by hand, which caused SLA violations.

Lengthy, error‑prone release cycles – hundreds of servers required manual updates for each deployment.

High maintenance overhead – operations required expertise in Lua/Ansible scripts, cloud networking, and monitoring.

Poor resource utilization – capacity was sized for peak load, leaving most of the fleet idle during off‑peak periods.

Complex permission management – RAM policies were applied at the machine level, making multi‑tenant access cumbersome.

Selection Process

Three migration paths were evaluated:

Deep script optimization: could automate some tasks but still depended on skilled ops personnel and manual ECS procurement.

Self‑built Kubernetes: offered high density and auto‑scaling but came with a steep learning curve and needed a dedicated ops team.

Alibaba Cloud Serverless Application Engine (SAE): provided direct WAR/JAR deployment, on‑demand elastic capacity, and minimal operational overhead.

SAE was chosen as the final solution.

Implementation Rounds

Round 1 – CI/CD Pipeline

Travis CI was integrated with SAE to replace the ECS deployment workflow. The pipeline runs three steps:

Run unit tests on each commit.

Upload build artifacts to a private OSS bucket.

Deploy the artifact to SAE using the same deploy step that previously targeted ECS.

SAE supports single‑batch, canary, and rollback strategies, enabling fast, reliable releases.
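The three pipeline steps above can be sketched as a small fail‑fast runner. This is a minimal illustration, not the team's actual Travis CI configuration: the step functions `run_unit_tests`, `upload_artifact`, and `deploy_to_sae` are hypothetical stand‑ins that would wrap the real test command, the OSS upload, and the SAE deployment call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Pipeline:
    """Runs named steps in order and stops at the first failure."""
    steps: List[Tuple[str, Callable[[], bool]]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def add(self, name: str, action: Callable[[], bool]) -> "Pipeline":
        self.steps.append((name, action))
        return self

    def run(self) -> bool:
        for name, action in self.steps:
            ok = action()
            self.log.append(f"{name}: {'ok' if ok else 'failed'}")
            if not ok:
                return False  # later steps (upload, deploy) never run
        return True


# Hypothetical placeholders: replace with the real test, upload, and deploy calls.
def run_unit_tests() -> bool:
    return True


def upload_artifact() -> bool:
    return True


def deploy_to_sae() -> bool:
    return True


def build_pipeline(test=run_unit_tests, upload=upload_artifact,
                   deploy=deploy_to_sae) -> Pipeline:
    """Assemble the steps in the order described in the article."""
    return (Pipeline()
            .add("test", test)
            .add("upload", upload)
            .add("deploy", deploy))
```

Ordering the steps this way means a failing unit test stops the run before anything is uploaded or deployed.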

Round 2 – First Application Migration (API Gateway)

The API gateway, the highest‑traffic service, was selected first because it already spanned multiple regions and could run in parallel on ECS and SAE. Traffic was gradually shifted to SAE while keeping the ECS instances as a hot standby.
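A gradual shift of this kind is often implemented as weighted routing with a ramp schedule. The sketch below shows one way to do it, assuming a hash‑based split. The stage percentages in `RAMP_STAGES` and the function names are illustrative and not taken from the article, which does not say how the team split traffic.

```python
import hashlib

# Hypothetical ramp: percentage of requests sent to SAE at each stage.
RAMP_STAGES = [1, 10, 50, 100]


def bucket(request_id: str) -> int:
    """Map a request ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(request_id: str, sae_percent: int) -> str:
    """Return "sae" or "ecs" for a request at the given ramp percentage."""
    if not 0 <= sae_percent <= 100:
        raise ValueError("sae_percent must be between 0 and 100")
    return "sae" if bucket(request_id) < sae_percent else "ecs"


def next_stage(current: int, healthy: bool) -> int:
    """Advance the ramp while SAE is healthy; otherwise send everything to ECS."""
    if not healthy:
        return 0  # the ECS hot standby takes all traffic again
    later = [stage for stage in RAMP_STAGES if stage > current]
    return later[0] if later else current
```

Because the bucket depends only on the request ID, a request routed to SAE at 10% stays on SAE at 50%, which keeps behavior stable as the ramp advances.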

Round 3 – Auto‑Scaling Under Surge

The team ran a stress test at five times the traffic of a blockbuster release. SAE auto‑scaling rules were configured with thresholds for CPU, memory, QPS, and response time. SAE scaled out within seconds under load and scaled back in when load dropped, delivering roughly 40% hardware cost savings compared with a permanently provisioned ECS fleet.
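To make the four‑metric rule concrete, here is a minimal sketch of a threshold‑based scaling decision. It uses a proportional rule (scale by the most stressed metric), similar in spirit to the Kubernetes Horizontal Pod Autoscaler formula. It is not SAE's actual algorithm, and the target values in `TARGETS` are hypothetical, not the team's settings.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    cpu_percent: float
    memory_percent: float
    qps_per_instance: float
    response_ms: float


# Hypothetical per-instance targets; real values should come from load testing.
TARGETS = Metrics(cpu_percent=60.0, memory_percent=70.0,
                  qps_per_instance=500.0, response_ms=200.0)


def desired_instances(current: int, observed: Metrics,
                      targets: Metrics = TARGETS,
                      min_instances: int = 2,
                      max_instances: int = 100) -> int:
    """Scale in proportion to the metric that is furthest above its target."""
    ratios = [
        observed.cpu_percent / targets.cpu_percent,
        observed.memory_percent / targets.memory_percent,
        observed.qps_per_instance / targets.qps_per_instance,
        observed.response_ms / targets.response_ms,
    ]
    pressure = max(ratios)  # one hot metric is enough to trigger scale-out
    wanted = math.ceil(current * pressure)
    return max(min_instances, min(max_instances, wanted))
```

For example, with 10 instances at 90% CPU against a 60% target, the function asks for 15 instances. When every metric sits at half its target, it asks for 5.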

Round 4 – Full‑Link Monitoring & Diagnosis

SAE’s built‑in ARMS monitoring provides:

Topology maps of service calls.

Slow‑SQL and slow‑service detection.

Method‑level call‑stack traces.

Top‑N application reports for quick prioritization.

These features reduced troubleshooting time dramatically.
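The idea behind a Top‑N report can be illustrated with a few lines of Python. This sketch ranks services by a latency percentile using the nearest‑rank method. It is an assumption about how such a report could be built, not a description of how ARMS computes it, and the service names in the usage example are made up.

```python
import math
from typing import Dict, List, Tuple


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def top_n_slowest(latencies_ms: Dict[str, List[float]],
                  n: int = 3, pct: float = 95.0) -> List[Tuple[str, float]]:
    """Return the n services with the highest latency at the given percentile."""
    ranked = sorted(
        ((name, percentile(samples, pct)) for name, samples in latencies_ms.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds.
    report = top_n_slowest({
        "gateway": [10, 12, 11, 300],
        "search": [50, 55, 60, 65],
        "billing": [5, 6, 7, 8],
    }, n=2)
    print(report)
```

Ranking by a high percentile rather than the mean surfaces services with occasional slow calls, which are usually the ones worth investigating first.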

Round 5 – Enterprise‑Grade Permission Isolation & Approval

Permission management shifted from machine‑level RAM policies to application‑level roles. A single grant per application is sufficient. SAE also enforces a main‑account approval workflow for any sub‑account operation, preventing unauthorized changes.
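The combination of application‑level grants and main‑account approval can be modeled as a small state machine. The sketch below is purely illustrative: the class and method names are hypothetical and do not correspond to the RAM or SAE APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ChangeRequest:
    requester: str
    application: str
    action: str
    approved: bool = False


@dataclass
class AccessControl:
    main_account: str
    # One grant set per application, instead of one policy per machine.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, application: str, sub_account: str) -> None:
        self.grants.setdefault(application, set()).add(sub_account)

    def request(self, requester: str, application: str, action: str) -> ChangeRequest:
        if requester not in self.grants.get(application, set()):
            raise PermissionError(f"{requester} has no grant on {application}")
        return ChangeRequest(requester, application, action)

    def approve(self, approver: str, change: ChangeRequest) -> None:
        if approver != self.main_account:
            raise PermissionError("only the main account can approve changes")
        change.approved = True

    def execute(self, change: ChangeRequest) -> str:
        if not change.approved:
            raise PermissionError("change has not been approved")
        return f"{change.action} on {change.application} by {change.requester}"
```

A sub‑account therefore needs two things before a change runs: a grant on the specific application and an explicit approval from the main account.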

Round 6 – Completion

Within seven days all 30+ services (hundreds of servers) were fully migrated to SAE. The migration required only 1–2 developers and incurred zero incidents.

Results & Benefits

Scaling time dropped from hours to seconds, with no over‑provisioning or under‑provisioning.

Release cycles accelerated via CI/CD and one‑click CloudToolkit deployments.

Operations became largely hands‑off; alerts trigger automatic remediation.

Integrated monitoring shortened problem‑diagnosis time.

Overall development efficiency improved by about 70%, costs fell by more than 40%, and scaling became more than 10× faster.

Key Takeaways

Deploy applications across multiple availability zones for resilience.

Use batch, canary, or gray‑release strategies for multi‑instance services.

Implement health‑check scripts and run them before deployment to avoid start‑up failures.

Derive scaling thresholds from thorough load‑testing; prefer conservative (lower) thresholds to prevent outages.

Configure SLS logging and ARMS alerts to enable effective post‑incident analysis.
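The batch‑release and health‑check takeaways can be combined into one rollout loop. The following is a minimal sketch under stated assumptions: `update` and `is_healthy` are hypothetical callables supplied by the caller (for example, a deploy call and an HTTP health probe), and SAE's own batch release does not necessarily work this way.

```python
import math
from typing import Callable, List, Sequence


def split_batches(instances: Sequence[str], batch_count: int) -> List[List[str]]:
    """Split instances into at most batch_count roughly equal batches."""
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    size = max(1, math.ceil(len(instances) / batch_count))
    return [list(instances[i:i + size]) for i in range(0, len(instances), size)]


def rollout(instances: Sequence[str], batch_count: int,
            update: Callable[[str], None],
            is_healthy: Callable[[str], bool]) -> List[str]:
    """Update batch by batch and stop at the first batch that fails its health check.

    Returns the instances that were updated and verified healthy.
    """
    verified: List[str] = []
    for batch in split_batches(instances, batch_count):
        for instance in batch:
            update(instance)
        if not all(is_healthy(instance) for instance in batch):
            return verified  # remaining batches are untouched; caller can roll back
        verified.extend(batch)
    return verified
```

Stopping at the first unhealthy batch limits the blast radius of a bad release to a fraction of the fleet.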
