
How Nacos and Spring Cloud Alibaba Powered MasterClass’s Cloud‑Native Microservice Migration

This article details MasterClass Education’s end‑to‑end cloud‑native migration using Spring Cloud Alibaba, Nacos, Sentinel and related tools, covering registry selection, Nacos server deployment, monitoring, logging, performance testing, high‑availability sync with Eureka, CI/CD pipelines, gray releases, APM integration and disaster‑recovery drills.

Alibaba Cloud Native

MasterClass Education transformed its online‑learning platform into a cloud‑native microservice architecture, leveraging Spring Cloud Alibaba, Nacos, Sentinel, Arthas and other components, and deploying on Docker and Alibaba Cloud Kubernetes.

Background

Rapid business growth and pandemic‑driven demand increased traffic and microservice count, exposing the limitations of the legacy Eureka registry and prompting a move to a more robust solution.

Why Choose Spring Cloud Alibaba & Nacos?

After evaluating Alibaba Nacos, HashiCorp Consul and K8s CoreDNS, the team selected Nacos because it supports both AP and CP consistency modes, multi‑language access through DNS‑F, load balancing, avalanche protection and multi‑data‑center deployment, and because it fits directly into the Spring Cloud Alibaba stack.

Conclusion on Registry Comparison

Nacos satisfies MasterClass’s service‑governance needs, enables smooth migration from Eureka, and benefits from an active community and rich feature set.

Nacos Server Deployment

Four isolated environments (DEV, FAT, UAT, PROD) are deployed; DEV uses a single node, while the others run three‑node clusters accessed via domain names behind SLB load balancers. Both SDK and dashboard connections use the domain‑based endpoints.

Environment & Domain Isolation

Namespaces isolate services per environment; the enabled flag in NacosDiscoveryProperties can disable local connections when needed.
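As a rough illustration of per-environment isolation, the sketch below maps each environment to its own endpoint and namespace and returns nothing when discovery is switched off. The domain names and namespace IDs are hypothetical placeholders, not MasterClass's real values. In a Spring Cloud Alibaba application the same choices would normally be made through the `spring.cloud.nacos.discovery.server-addr`, `namespace` and `enabled` properties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NacosEnv:
    server_addr: str  # hypothetical domain endpoint in front of the cluster
    namespace: str    # hypothetical namespace ID for this environment


# Placeholder values for illustration only.
ENVIRONMENTS = {
    "DEV": NacosEnv("nacos-dev.example.internal:8848", "dev"),
    "FAT": NacosEnv("nacos-fat.example.internal:8848", "fat"),
    "UAT": NacosEnv("nacos-uat.example.internal:8848", "uat"),
    "PROD": NacosEnv("nacos-prod.example.internal:8848", "prod"),
}


def resolve_env(name: str, discovery_enabled: bool = True) -> Optional[NacosEnv]:
    """Return connection settings, or None when discovery is disabled."""
    if not discovery_enabled:
        return None
    try:
        return ENVIRONMENTS[name.upper()]
    except KeyError:
        raise ValueError(f"unknown environment: {name!r}") from None
```

A service started on a developer laptop could call `resolve_env("dev", discovery_enabled=False)` and skip registration entirely, while a deployed service resolves its environment normally.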

LDAP Integration

The dashboard authenticates users against the corporate LDAP directory and records first‑login information.

Dashboard Permissions

Default users have read‑only access; only administrators can modify services.

Service Overview

The dashboard displays total services and instances, refreshed every five seconds.

Monitoring

Standard monitoring uses the company’s Prometheus, Grafana and Alertmanager stack. Advanced monitoring follows the Nacos monitoring guide, exposing custom metrics to Prometheus and visualizing them in Grafana.

Instance State Monitoring

Heartbeat, registration, deregistration and timeout events are tracked to ensure service health.
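The tracked events can be pictured with a small in-memory model. This is a minimal sketch, not Nacos's implementation. The 15-second and 30-second thresholds mirror Nacos's default heartbeat timeout and instance removal timeout, but they are treated here as configurable assumptions, and the class and event names are made up for illustration.

```python
UNHEALTHY_AFTER = 15.0  # seconds without a heartbeat before marking unhealthy
REMOVE_AFTER = 30.0     # seconds without a heartbeat before removing the instance


class InstanceTracker:
    """Tracks last-heartbeat times and records state-change events."""

    def __init__(self):
        self.last_beat = {}  # instance id -> time of last heartbeat
        self.events = []     # (event name, instance id) tuples, oldest first

    def register(self, instance, now):
        self.last_beat[instance] = now
        self.events.append(("register", instance))

    def heartbeat(self, instance, now):
        # A heartbeat from an unknown instance is treated as a registration.
        if instance not in self.last_beat:
            self.register(instance, now)
        else:
            self.last_beat[instance] = now

    def deregister(self, instance):
        if self.last_beat.pop(instance, None) is not None:
            self.events.append(("deregister", instance))

    def sweep(self, now):
        """Drop timed-out instances and return the status of the rest."""
        status = {}
        for instance, beat in list(self.last_beat.items()):
            silence = now - beat
            if silence >= REMOVE_AFTER:
                del self.last_beat[instance]
                self.events.append(("timeout", instance))
            elif silence >= UNHEALTHY_AFTER:
                status[instance] = "unhealthy"
            else:
                status[instance] = "healthy"
        return status
```

Feeding the `events` list into a metrics counter per event type would give the kind of up/down and timeout series a dashboard can plot.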

Logging

Logs from all Nacos modules are merged by level (info, warn, error), tagged with a schema field, and output in JSON format for ELK ingestion.
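Nacos itself is a Java application, so the real change lives in its logging configuration. The Python sketch below only illustrates the shape of one resulting log line: a single JSON object per event carrying the level, the originating module and a schema tag. The field names other than `schema` and the value `"nacos"` are assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Formats each record as one JSON object per line."""

    def __init__(self, schema):
        super().__init__()
        self.schema = schema  # tag used downstream to route or index the log

    def format(self, record):
        payload = {
            "schema": self.schema,
            "level": record.levelname.lower(),
            "module": record.name,
            "message": record.getMessage(),
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
        }
        return json.dumps(payload, ensure_ascii=False)


def build_logger(name, schema="nacos"):
    """Return a logger that writes JSON lines to the console."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(schema))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Because every line is self-describing JSON, the ELK side can index on `schema` and `level` without per-module parsing rules.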

Alerting

Alerts fire on service instance up/down events and on naming convention violations (e.g., uppercase service names).
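Both alert conditions reduce to simple checks over registry snapshots. The following is a minimal sketch under that assumption. The function names are hypothetical, and how the snapshots are fetched and how alerts are delivered is left out.

```python
def naming_violations(service_names):
    """Return the service names that contain uppercase characters."""
    return [name for name in service_names if any(ch.isupper() for ch in name)]


def instance_changes(previous, current):
    """Compare two snapshots of instance addresses and report up/down events."""
    previous, current = set(previous), set(current)
    return {
        "up": sorted(current - previous),
        "down": sorted(previous - current),
    }
```

Running `instance_changes` on consecutive polls yields the instances to announce as up or down, and `naming_violations` can be run on the same poll to flag badly named services.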

Performance Testing

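import random
import time

import nacos  # nacos-sdk-python

nacos_host = "127.0.0.1:8848"  # placeholder: address of the cluster under test


def registry(ip):
    # Pick a random service name from a semicolon-separated list.
    with open("service_name.txt", "r") as fo:
        service_name_list = fo.read().strip().split(";")
    service_name = random.choice(service_name_list)
    client = nacos.NacosClient(nacos_host, namespace='')
    # Register the instance: port 333, cluster "default", weight 1.0.
    print(client.add_naming_instance(service_name, ip, 333, "default", 1.0, {'preserved.ip.delete.timeout': 86400000}, True, True))
    # Send a heartbeat every 5 seconds to keep the instance registered.
    while True:
        print(client.send_heartbeat(service_name, ip, 333, "default", 1.0, "{}"))
        time.sleep(5)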

A three‑node Nacos cluster, each node with 1 CPU core and 4 GB of memory (1C4G), was tested with 1,499 services and 12,715 instances. CPU and memory usage stayed within acceptable ranges, confirming that Nacos performs well enough for this workload.

Nacos‑Eureka Sync Implementation

To achieve bidirectional synchronization between Nacos and Eureka, several designs were explored:

Official sync solution – simple but not HA and failed under load.

High‑availability consistent‑hash + Zookeeper – distributes sync tasks using a hash ring and watches Zookeeper nodes for failures.

Primary‑backup + Zookeeper – higher cost and less elegant, ultimately discarded.

Consistent‑hash + Etcd – persists service lists in Etcd, uses Etcd watches for re‑hashing, and serves as the bridge for bidirectional sync.

Sync Architecture

Sync services pull service lists from both registries, persist tasks in Etcd, and watch Etcd for node health. Consistent‑hash routing assigns tasks to live nodes; failed nodes trigger re‑hashing and task redistribution. Etcd leases with TTL ensure rapid detection of node loss.
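The consistent-hash routing described above can be sketched as a hash ring with virtual nodes. This is an illustrative model, not the team's code: the class name, the virtual-node count and the choice of SHA-256 are assumptions, and the Etcd watch that would call `remove` when a lease expires is not shown.

```python
import bisect
import hashlib


class HashRing:
    """Assigns sync tasks to nodes so that removing a node moves only its tasks."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []    # sorted ring positions
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Each node occupies several positions to spread load evenly.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos not in self._owners:
                bisect.insort(self._keys, pos)
            self._owners[pos] = node

    def remove(self, node):
        # Called when a node is detected as lost; its tasks fall to neighbours.
        self._keys = [k for k in self._keys if self._owners[k] != node]
        self._owners = {k: v for k, v in self._owners.items() if v != node}

    def owner(self, task):
        if not self._keys:
            raise LookupError("no live sync nodes")
        # First ring position clockwise from the task's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(task)) % len(self._keys)
        return self._owners[self._keys[idx]]
```

With this structure, a failed node's tasks are redistributed among the survivors while every other task keeps its current owner, which keeps re-hashing cheap.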

Sync UI and Monitoring

The UI shows real‑time sync status, allows manual addition/removal of sync tasks, and integrates with the DevOps release platform for automated task creation when services are deployed.

Alerting and Upgrade Drills

Extensive disaster‑recovery drills demonstrated that the sync cluster could tolerate up to eight of nine nodes failing, with automatic re‑hashing and recovery times under one minute. Upgrade procedures for FAT, UAT and PROD environments completed without service disruption.

Solar Cloud‑Native Microservice Practice

MasterClass built the “Solar” framework on top of Spring Cloud Alibaba, providing SDKs for Nacos, Sentinel, gray‑release, environment isolation, and DevOps integration.

Nacos SDK

Encapsulates four environment domains, supports cross‑registry blue‑green releases, integrates with SkyWalking for tracing, and offers annotations like @EnableSolarService to simplify usage.

Sentinel SDK

Provides environment‑aware Sentinel endpoints, persists rules to Apollo, supports OpenTracing & SkyWalking integration, and adds cluster‑level flow control and custom limit‑app rules for gray releases.
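To make the idea of per-caller limits concrete, here is a minimal sliding-window QPS limiter keyed by caller origin. It is a simplified stand-in and not Sentinel's algorithm or API. The class name, the `origin` labels and the limits are hypothetical.

```python
import time
from collections import defaultdict, deque


class QpsLimiter:
    """Allows at most N requests per rolling second for each caller origin."""

    def __init__(self, limits, default_limit):
        self.limits = limits                # origin -> max requests per second
        self.default_limit = default_limit  # limit for origins not listed
        self._hits = defaultdict(deque)     # origin -> timestamps of passed requests

    def try_pass(self, origin, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[origin]
        # Forget requests that are older than one second.
        while hits and now - hits[0] >= 1.0:
            hits.popleft()
        if len(hits) >= self.limits.get(origin, self.default_limit):
            return False
        hits.append(now)
        return True
```

Giving gray-release callers a lower limit than stable callers, for example `QpsLimiter({"gray": 2}, default_limit=5)`, protects a new version from receiving more traffic than intended.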

Gray Release & Environment Isolation

Uses Spring Cloud Alibaba, Nacos SDK and Nepxion Discovery to implement version‑based and region‑based routing, with UI panels for condition‑driven and weight‑based traffic splitting.
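The two routing modes can be illustrated with a short sketch: an explicit version condition pins the request to matching instances, otherwise a version is drawn according to configured weights. This is not Nepxion Discovery's implementation. The function signature, the instance dictionaries and the weight format are assumptions made for the example.

```python
import random


def route(instances, version=None, weights=None, rng=random):
    """Pick one instance.

    instances: list of dicts with "addr" and "version" keys.
    version:   if given, only instances of that version are eligible.
    weights:   optional mapping of version -> relative weight.
    """
    if version is None:
        versions = sorted({inst["version"] for inst in instances})
        if not versions:
            raise LookupError("no instances available")
        if weights:
            w = [weights.get(v, 0) for v in versions]
        else:
            w = [1] * len(versions)  # no weights configured: split evenly
        version = rng.choices(versions, weights=w, k=1)[0]
    candidates = [inst for inst in instances if inst["version"] == version]
    if not candidates:
        raise LookupError(f"no instance for version {version!r}")
    return rng.choice(candidates)
```

Shifting the weights gradually, for example from `{"1.0": 95, "2.0": 5}` toward `{"1.0": 0, "2.0": 100}`, moves traffic to the new version step by step. Note that `random.choices` raises `ValueError` if every weight is zero, so at least one version must keep a positive weight.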

DevOps Release Platform

Offers semi‑automatic gray releases, rolling no‑downtime deployments, and integrates with the sync system to keep registries consistent.

APM with SkyWalking

Collects service, instance and endpoint metrics, visualizes anomalies, and correlates logs and traces. Sentinel metrics are also exported to SkyWalking.

Application Diagnosis (Arthas & Bistoury)

Provides a web console for JVM diagnostics, hotspot analysis and online debugging without SSH access.

CI/CD Pipeline

Jenkins builds JARs and Docker images, pushes them to Harbor, and a custom Hyperion service invokes Alibaba Cloud Kubernetes APIs for deployment. Harbor events are fed back to the CI/CD UI for real‑time status.

Log Collection

Pods write logs to a shared path; Filebeat on each node ships logs to Kafka, GoHangout consumers persist them to Elasticsearch, and Kibana visualizes the data. The pipeline scales without performance issues.

Elastic Scaling & Self‑Healing

Metrics from Sentinel, monitoring and custom health checks drive horizontal pod autoscaling and automated recovery actions.
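For reference, the Kubernetes Horizontal Pod Autoscaler computes its target as `ceil(currentReplicas * currentMetric / targetMetric)` and skips scaling when the ratio is within a tolerance of 1 (0.1 by default). The sketch below reproduces that calculation. The replica bounds and the metric values in the usage note are illustrative assumptions, not figures from MasterClass.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the HPA formula, clamped to the allowed range."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the deployment alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, four pods averaging 200 QPS each against a 100 QPS target would be scaled to eight pods, while a reading of 105 QPS falls inside the tolerance band and changes nothing.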

Network Migration

Alibaba Cloud Terway CNI provides flat networking; SLB automatically tracks pod IP changes, enabling seamless migration to Kubernetes.

Conclusion

MasterClass’s cloud‑native journey covers a production‑grade migration from Eureka to Nacos together with high‑availability registry sync, observability, automated release and resilient operations, and it offers practical guidance for teams adopting Spring Cloud Alibaba‑based microservices on Kubernetes.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
