Containerizing the Live Classroom Service: Architecture, Migration Process, and Lessons Learned
This article covers the background, goals, architectural analysis, migration scope, step‑by‑step containerization process, code‑level challenges, and post‑migration results of moving a large‑scale live‑classroom platform from virtual machines to a Kubernetes‑based container environment. It highlights the resulting performance, reliability, and operational improvements.
In recent years, micro‑service architectures have grown rapidly, but traditional virtual‑machine deployments struggle with scaling and management overhead, especially for high‑traffic live‑classroom services that require rapid expansion.
Containerization offers a lightweight alternative, reducing resource consumption and enabling faster scaling. The project aimed to migrate the entire live‑classroom platform to containers by the 2020 winter term, with a fallback to the VM environment if needed.
The existing system consists of an access layer (HTTP APIs), a service layer (micro‑services providing RPC), an infrastructure layer (Redis, MySQL, Zookeeper, Kafka, logging, publishing, gateway), and external dependencies (course, material, OA, user systems). The migration focused on three major areas: stateful services, persistent file storage, and auxiliary processes.
Key migration steps included:
Redesigning TW node service discovery by consolidating agents into a centralized cluster.
Replacing file‑based message‑queue fallback with Redis for reliability.
Adopting hostPath volumes for log persistence and adding random suffixes to log files to avoid overwrites.
Implementing Kubernetes storage options (emptyDir, hostPath, ConfigMap, Secret) for various needs.
Updating service registration and discovery to use Zookeeper with a fallback to Kubernetes Services, illustrated by code changes.
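The last step, registry lookup with a fallback to Kubernetes Services, can be sketched roughly as follows. This is a minimal illustration and not the team's code: the `registry` interface, `staticRegistry`, `resolve`, and the service and namespace names are all hypothetical, and the DNS suffix assumes the default `cluster.local` cluster domain.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a hypothetical lookup interface. In the article's setting it
// would be backed by Zookeeper; here it is abstracted so the sketch runs alone.
type registry interface {
	Lookup(service string) ([]string, error)
}

var errNoEndpoints = errors.New("no endpoints registered")

// staticRegistry is an in-memory stand-in for the real registry.
type staticRegistry map[string][]string

func (r staticRegistry) Lookup(service string) ([]string, error) {
	addrs, ok := r[service]
	if !ok || len(addrs) == 0 {
		return nil, errNoEndpoints
	}
	return addrs, nil
}

// resolve prefers addresses from the registry and falls back to the
// Kubernetes Service DNS name when the lookup fails or returns nothing.
// Assumption: the cluster uses the default "cluster.local" domain.
func resolve(r registry, service, namespace string, port int) []string {
	addrs, err := r.Lookup(service)
	if err == nil && len(addrs) > 0 {
		return addrs
	}
	return []string{fmt.Sprintf("%s.%s.svc.cluster.local:%d", service, namespace, port)}
}

func main() {
	reg := staticRegistry{"room": {"10.0.0.5:8080", "10.0.0.6:8080"}}
	fmt.Println(resolve(reg, "room", "live", 8080)) // prints "[10.0.0.5:8080 10.0.0.6:8080]"
	fmt.Println(resolve(reg, "chat", "live", 9000)) // prints "[chat.live.svc.cluster.local:9000]"
}
```

With this shape, a registry outage degrades to load balancing through the Service instead of failing the call outright.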
Code example for Zookeeper watch implementation:
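func (s *Zookeeper) WatchTree(directory string, stopCh <-chan struct{}) (<-chan []*store.KVPair, error) {
	// List the current children first so the caller receives an initial snapshot.
	entries, err := s.List(directory)
	if err != nil {
		return nil, err
	}
	// Forward Zookeeper notifications to the returned channel.
	watchCh := make(chan []*store.KVPair)
	go func() {
		defer close(watchCh)
		// Emit the current state before waiting for any event.
		watchCh <- entries
		for {
			// Zookeeper watches are one-shot, so the watch is registered again on every iteration.
			// The children returned by this call are discarded; only the event channel is used.
			_, _, eventCh, err := s.client.ChildrenW(s.normalize(directory))
			if err != nil {
				return
			}
			select {
			case e := <-eventCh:
				if e.Type == zk.EventNodeChildrenChanged {
					// Re-list the directory and publish the new snapshot; a failed List is skipped.
					if kv, err := s.List(directory); err == nil {
						watchCh <- kv
					}
				}
			case <-stopCh:
				// The caller asked to stop watching.
				return
			}
		}
	}()
	return watchCh, nil
}
Additional code for listing Zookeeper keys: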
func (s *Zookeeper) List(directory string) ([]*store.KVPair, error) {
	keys, stat, err := s.client.Children(s.normalize(directory))
	if err != nil {
		// Map Zookeeper's "no node" error to the store's generic not-found error.
		if err == zk.ErrNoNode {
			return nil, store.ErrKeyNotFound
		}
		return nil, err
	}
	kv := []*store.KVPair{}
	// One Get per child key, which is costly for large directories.
	for _, key := range keys {
		pair, err := s.Get(strings.TrimSuffix(directory, "/") + s.normalize(key))
		if err != nil {
			// A child vanished between Children and Get, so the listing is stale: retry from scratch.
			if err == store.ErrKeyNotFound {
				return s.List(directory)
			}
			return nil, err
		}
		kv = append(kv, &store.KVPair{
			Key:       key,
			Value:     []byte(pair.Value),
			LastIndex: uint64(stat.Version),
		})
	}
	return kv, nil
}
Testing revealed event‑loss issues during rapid pod scaling, which were resolved by resetting the Zookeeper watch after each event, as demonstrated in the updated design diagrams.
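The fixed code is not shown here, so the following is only a sketch of one way a watch loop can re-arm after each event without missing changes. It forwards the snapshot returned by the same call that arms the watch, so a change that lands between an event and the re-arm still reaches the consumer. The `childWatcher` interface, `watchChildren`, and `fakeZK` are hypothetical stand-ins that mimic one-shot watch semantics; they are not the go-zookeeper API.

```go
package main

import (
	"fmt"
	"sync"
)

// childWatcher is a hypothetical, simplified stand-in for a Zookeeper client.
// ChildrenW returns the current children plus a one-shot channel that is
// closed on the next change, mirroring one-shot watch semantics.
type childWatcher interface {
	ChildrenW(path string) ([]string, <-chan struct{}, error)
}

// watchChildren re-arms the watch on every iteration and forwards the snapshot
// returned by the arming call, so the consumer may skip intermediate states
// but always converges to the latest one.
func watchChildren(c childWatcher, path string, stop <-chan struct{}) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		for {
			children, changed, err := c.ChildrenW(path)
			if err != nil {
				return
			}
			select {
			case out <- children: // publish the snapshot taken when the watch was armed
			case <-stop:
				return
			}
			select {
			case <-changed: // watch fired: loop to re-arm and take a fresh snapshot
			case <-stop:
				return
			}
		}
	}()
	return out
}

// fakeZK is an in-memory childWatcher used only to exercise the loop.
type fakeZK struct {
	mu       sync.Mutex
	children []string
	watches  []chan struct{}
}

func (f *fakeZK) ChildrenW(string) ([]string, <-chan struct{}, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	ch := make(chan struct{})
	f.watches = append(f.watches, ch)
	return append([]string(nil), f.children...), ch, nil
}

// set replaces the children and fires every pending one-shot watch.
func (f *fakeZK) set(children ...string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.children = children
	for _, ch := range f.watches {
		close(ch)
	}
	f.watches = nil
}

func main() {
	zk := &fakeZK{children: []string{"pod-a"}}
	stop := make(chan struct{})
	updates := watchChildren(zk, "/services/room", stop)
	fmt.Println(<-updates) // prints "[pod-a]"
	zk.set("pod-a", "pod-b")
	zk.set("pod-b") // second change lands before the consumer reads again
	var last []string
	for i := 0; i < 2; i++ { // at most one intermediate snapshot can precede the final one
		last = <-updates
		if len(last) == 1 && last[0] == "pod-b" {
			break
		}
	}
	fmt.Println(last) // prints "[pod-b]"
	close(stop)
}
```

The two back-to-back `set` calls imitate rapid pod scaling: whether or not the intermediate state is observed, the last snapshot delivered matches the final membership.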
The migration also covered asynchronous consumer services, scheduled tasks (replacing XXL‑Job with a custom Go‑based scheduler supporting second‑level granularity), and gray‑release strategies with one‑click rollback.
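To make "second‑level granularity" concrete, here is a minimal sketch of a tick-driven scheduler in Go. It is an assumption about the general shape of such a scheduler, not the team's implementation: `job`, `runScheduler`, and `syntheticTicks` are hypothetical names, and jobs run synchronously for simplicity.

```go
package main

import (
	"fmt"
	"time"
)

// job is a hypothetical task definition: run is invoked every `every` seconds.
type job struct {
	name  string
	every int // interval in seconds; values below 1 are ignored
	run   func()
}

// runScheduler consumes one tick per second and runs each job whose interval
// divides the number of ticks seen so far. It returns when ticks is closed.
// Jobs run synchronously here; a real scheduler would likely dispatch them to
// goroutines and guard against overlapping runs.
func runScheduler(ticks <-chan time.Time, jobs []job) {
	elapsed := 0
	for range ticks {
		elapsed++
		for _, j := range jobs {
			if j.every > 0 && elapsed%j.every == 0 {
				j.run()
			}
		}
	}
}

// syntheticTicks returns a closed channel preloaded with n ticks so the demo
// finishes instantly; production code would pass time.NewTicker(time.Second).C.
func syntheticTicks(n int) <-chan time.Time {
	ch := make(chan time.Time, n)
	for i := 0; i < n; i++ {
		ch <- time.Now()
	}
	close(ch)
	return ch
}

func main() {
	heartbeats, reports := 0, 0
	jobs := []job{
		{name: "heartbeat", every: 2, run: func() { heartbeats++ }},
		{name: "report", every: 3, run: func() { reports++ }},
	}
	runScheduler(syntheticTicks(6), jobs)
	fmt.Println(heartbeats, reports) // prints "3 2"
}
```

Taking the tick source as a channel keeps the loop testable without waiting on real time. Note that a `time.Ticker` drops ticks for slow receivers, so long-running jobs should not execute on the scheduling loop itself.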
After completing the migration, the live‑classroom platform operated fully in the container environment, handling millions of concurrent students during the 2020 winter term, with plans for further cloud integration, dynamic scaling, and service mesh adoption.
Xueersi Online School Tech Team
The Xueersi Online School Tech Team, dedicated to innovating and promoting internet education technology.
