
How Yuque Scaled from Prototype to Cloud‑Native Service with Serverless Functions

This article chronicles Yuque's evolution from a hobbyist prototype to a commercial cloud‑native knowledge platform, detailing its backend migrations, adoption of Node.js, Egg, React, serverless function compute, micro‑service splits, and the architectural lessons learned for stability, cost efficiency, and scalability.

ITPUB

Prototype Stage (2016)

Yuque originated in 2016 as an internal documentation tool for Ant Financial Cloud, built by an engineer in their spare time using the lowest‑cost stack available. The backend relied on internal BaaS services:

Object Service – a MongoDB‑like data store

File Service – a wrapper around Alibaba Cloud OSS

DockerLab – an internal container hosting platform

The application server was a single Node.js monolith using Ant's open‑source Egg framework (internal fork "Chair"). The front‑end used React, Ant Design, and CodeMirror to provide a powerful Markdown editor.

Despite being an engineer’s side project, this stack proved sufficient to validate the online documentation prototype.

Internal Service Stage (2017)

In 2017 Yuque gained internal adoption and expanded its feature set: a rich‑text editor for non‑technical users, formula support, drawing, and mind‑map capabilities. The three‑layer knowledge structure (team → knowledge base → document) began to solidify.

To meet growing demands, the team replaced the BaaS layer with Alibaba Cloud IaaS services (MySQL, OSS, cache, search). The backend remained a large Egg‑based Node.js application, but the business layer adopted an ORM for clearer data‑model separation.

On the front‑end, the editor migrated from CodeMirror to Slate. The team forked Slate for deeper customization and introduced a proprietary content storage format to improve performance and compatibility.

Commercialization Stage (2018‑Present)

After half a year of refactoring, Yuque opened to external customers in early 2018. New challenges emerged: richer knowledge‑creation features (tables, mind‑maps, real‑time collaboration) and higher expectations for stability, security, and cost control.

Key architectural changes included:

Moving core services to Alibaba Cloud IaaS (MySQL, OSS, cache, search) for better reliability.

Continuing to use Egg for the monolithic Node.js web service while extracting independent functions into micro‑services and serverless components.

Splitting functionality into three categories:

Micro‑services (e.g., real‑time collaboration), so that services holding long‑lived connections do not have to be redeployed with every frequent release of the main application.

Task services (e.g., heavy file preview) to isolate resource‑intensive jobs.

Function‑compute services (e.g., PlantUML, Mermaid rendering) that run in a pay‑per‑use, sandboxed environment.

The front‑end eventually moved away from Slate to a self‑developed editor built on contenteditable, Canvas for tables, and SVG for mind‑maps.

The Unbreakable Bond with Function Compute

Node.js excels at I/O‑bound workloads but struggles with CPU‑intensive or potentially blocking tasks such as complex Markdown parsing, running third‑party tools (Puppeteer, Graphviz), or transcoding video and audio with FFmpeg. Such operations can stall the single‑threaded event loop, jeopardizing service stability.
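The stall described above is easy to reproduce. The following is a minimal sketch, not code from Yuque: burnCpu and measureTimerDelay are hypothetical names, and the busy loop merely stands in for a synchronous, CPU‑heavy job. It schedules a timer that should fire immediately, blocks the thread, and reports how late the timer actually ran.

```typescript
// Busy-wait on the CPU for roughly `ms` milliseconds. This stands in for a
// synchronous, CPU-heavy job such as parsing a pathological document.
function burnCpu(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Intentionally empty: this loop never yields to the event loop.
  }
}

// Schedule a timer that should fire immediately, then block the thread.
// Resolves with how late the timer actually fired, in milliseconds.
function measureTimerDelay(blockMs: number): Promise<number> {
  return new Promise<number>((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    burnCpu(blockMs); // the timer callback cannot run until this returns
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`a 0 ms timer fired after about ${delay} ms`);
});
```

In a web server, every pending request is in the position of that timer: nothing else is served until the synchronous work returns.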

Alibaba Cloud Function Compute is an event‑driven, fully managed compute service. You write and upload code, and you pay only for the resources consumed while it runs; idle code incurs no cost.

By offloading CPU‑heavy jobs to Function Compute, Yuque restored its backend to an I/O‑centric model, gaining:

Pay‑per‑use billing, eliminating the need for a permanently provisioned task cluster.

Isolation of each invocation, preventing crashes or memory leaks from affecting the main service.

Sandboxed execution, mitigating security risks from malicious user input.

Example: converting user‑submitted HTML/Markdown to Yuque’s internal format. Most inputs parse quickly, but rare edge cases can trigger infinite loops in the parser. Running this conversion in Function Compute protects the main service from being blocked.
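From the main service's point of view, this pattern amounts to awaiting a remote call under a deadline. The sketch below is an assumption‑laden illustration rather than Yuque's implementation: RemoteConvert is a hypothetical stand‑in for whatever SDK call or HTTP trigger invokes the deployed function, and convertWithDeadline and ConvertOutcome are names invented here.

```typescript
// Hypothetical shape of the remote conversion call. A real version would
// invoke the deployed function through the provider's SDK or an HTTP trigger.
type RemoteConvert = (source: string) => Promise<string>;

type ConvertOutcome =
  | { status: "ok"; content: string }
  | { status: "timeout" }
  | { status: "failed"; message: string };

// Race the remote call against a deadline so the caller never waits forever.
// The web process only awaits I/O here; the parsing itself runs elsewhere.
function convertWithDeadline(
  convert: RemoteConvert,
  source: string,
  deadlineMs: number,
): Promise<ConvertOutcome> {
  return new Promise<ConvertOutcome>((resolve) => {
    const timer = setTimeout(() => resolve({ status: "timeout" }), deadlineMs);
    convert(source).then(
      (content) => {
        clearTimeout(timer);
        resolve({ status: "ok", content });
      },
      (err: unknown) => {
        clearTimeout(timer);
        resolve({ status: "failed", message: String(err) });
      },
    );
  });
}

// Stand-ins for the demo: one converter answers, the other never does,
// mimicking a parser stuck in an infinite loop on the remote side.
const healthyConvert: RemoteConvert = (source) => Promise.resolve(source.trim());
const stuckConvert: RemoteConvert = () => new Promise<string>(() => {});

convertWithDeadline(healthyConvert, "  # Title  ", 1000).then((r) => console.log(r.status));
convertWithDeadline(stuckConvert, "pathological input", 100).then((r) => console.log(r.status));
```

A deadline like this only works because the parser runs outside the web process. A synchronous infinite loop inside the same process would also prevent the timeout callback from ever running.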

Other Function Compute use cases include:

Generating diagrams (PlantUML, Mermaid) and rendering formulas.

Transcoding video/audio files using FFmpeg, which cut costs to one‑fifth of what the previously used Alibaba Cloud video service cost.

Executing user‑provided code in a sandboxed container.

These scenarios share two traits: they are CPU‑intensive, and they either are not latency‑sensitive or need strong isolation.

When to Use Function Compute

CPU‑heavy operations with modest time‑sensitivity, to offload pressure from the main service.

Sandboxed execution of untrusted user code.

Running unstable third‑party applications that are difficult to keep continuously alive.

Workloads requiring rapid, elastic scaling.
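The four criteria above can be written down as a small checklist function. This is only an illustrative sketch: the WorkloadTraits fields and the reasonsForFunctionCompute helper are hypothetical names, and the traits assigned to the sample workloads are guesses for demonstration, not Yuque's own classification.

```typescript
// Traits of a workload, named after the four criteria listed above.
// All field and function names here are illustrative.
interface WorkloadTraits {
  cpuHeavy: boolean;
  latencySensitive: boolean;
  runsUntrustedCode: boolean;
  wrapsUnstableTool: boolean;
  needsBurstScaling: boolean;
}

// Return every criterion that argues for moving the workload to a function.
// An empty list means none of the four criteria applies.
function reasonsForFunctionCompute(w: WorkloadTraits): string[] {
  const reasons: string[] = [];
  if (w.cpuHeavy && !w.latencySensitive) {
    reasons.push("CPU-heavy and latency-tolerant");
  }
  if (w.runsUntrustedCode) {
    reasons.push("needs a sandbox for untrusted code");
  }
  if (w.wrapsUnstableTool) {
    reasons.push("wraps an unstable third-party tool");
  }
  if (w.needsBurstScaling) {
    reasons.push("needs rapid elastic scaling");
  }
  return reasons;
}

// Sample workloads with guessed traits.
const videoTranscoding: WorkloadTraits = {
  cpuHeavy: true,
  latencySensitive: false,
  runsUntrustedCode: false,
  wrapsUnstableTool: false,
  needsBurstScaling: true,
};
const pageRequest: WorkloadTraits = {
  cpuHeavy: false,
  latencySensitive: true,
  runsUntrustedCode: false,
  wrapsUnstableTool: false,
  needsBurstScaling: false,
};

console.log(reasonsForFunctionCompute(videoTranscoding));
console.log(reasonsForFunctionCompute(pageRequest));
```

Returning the list of matching reasons, rather than a bare yes or no, keeps the decision explainable when a workload is reviewed later.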

Current Architecture Overview

Yuque now runs a monolithic Node.js application at its core, complemented by micro‑services for independent modules and serverless functions for CPU‑bound tasks. The entire stack is hosted on Alibaba Cloud, leveraging a rich set of managed services (databases, storage, queues, search, AI services such as OCR and translation).

Conclusion – Lessons on Tech‑Stack Selection

Key takeaways from Yuque’s journey:

Align technology choices with the product’s development stage; early phases prioritize rapid iteration, while later stages demand stability and performance.

Consider the team’s expertise: Yuque’s full‑stack JavaScript approach was a natural fit for its engineers.

Regardless of language or services, prioritize security, stability, and maintainability; these fundamentals determine long‑term success.
