How DingTalk Docs Enables Real-Time Collaborative Rich‑Text Editing Without contentEditable

This article explains how DingTalk Docs implements professional layout, custom rendering, and Operational Transformation to support complex rich‑text features and seamless multi‑user collaboration, all without relying on the native contentEditable API.

Alibaba Terminal Technology

DingTalk Docs, a core component of Alibaba's DingTalk office suite, has evolved over three years into a highly complex product that supports professional layout (pagination, columns, mixed text‑image) and innovative features such as embedded mind maps and maps, while also handling real‑time collaborative editing.

What You See Is What You Get

The editor provides WYSIWYG editing with advanced layout support. Line breaking is a typical example: a paragraph that spans multiple pages can only be split between lines, so the engine measures characters to determine where each line ends.

1. Measurement and Splitting

User‑entered content lacks layout information. The layout engine measures each character, splits paragraphs, and produces a view model that is then rendered into the final DOM.

Every glyph is measured, and the accumulated widths are compared against the container width to decide where each line breaks.
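
As a rough illustration of width-based splitting, the sketch below breaks text greedily at the character level. It is not DingTalk Docs' actual engine: measureChar is a hypothetical stand-in that returns fixed widths, whereas a real editor would obtain glyph widths from the browser and would also respect word boundaries and punctuation rules.

```javascript
// Hypothetical measurer: fixed widths per character class (assumption).
// A real layout engine would measure glyphs with the browser instead.
function measureChar(ch) {
  return /[\u4e00-\u9fff]/.test(ch) ? 16 : 8;
}

// Greedy line breaking: accumulate character widths and start a new line
// when the next character would overflow the container width.
function breakLines(text, containerWidth, measure = measureChar) {
  const lines = [];
  let current = '';
  let width = 0;
  for (const ch of text) {
    const w = measure(ch);
    // Never emit an empty line: a character wider than the container
    // still gets a line of its own.
    if (current !== '' && width + w > containerWidth) {
      lines.push(current);
      current = '';
      width = 0;
    }
    current += ch;
    width += w;
  }
  if (current !== '') lines.push(current);
  return lines;
}
```

With the fixed widths assumed here, a 32-unit container holds four Latin characters or two Chinese characters per line.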

2. Measurement Result Caching

Because measuring every character is costly, DingTalk Docs caches each result under a character+style key. Chinese characters are all measured as the single placeholder "中", presumably because they share the same width within a given style, so they hit one cache entry per style instead of one per character.
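
A minimal sketch of such a cache follows. The function name createMeasureCache, the style fields (fontFamily, fontSize, bold), the key layout, and the Unicode range used to detect Chinese characters are all assumptions made for illustration; the source only states that the key combines character and style and that Chinese glyphs share the "中" placeholder.

```javascript
// Wraps an expensive measuring function with a cache keyed by
// "character + style". All names here are hypothetical.
function createMeasureCache(rawMeasure) {
  const cache = new Map();
  // Assumption: treat the CJK Unified Ideographs block as "Chinese".
  const isChinese = (ch) => /[\u4e00-\u9fff]/.test(ch);

  function measure(ch, style) {
    // Every Chinese character is measured as the placeholder "中".
    const keyChar = isChinese(ch) ? '中' : ch;
    const key = [keyChar, style.fontFamily, style.fontSize, style.bold ? 'b' : 'n'].join('|');
    if (!cache.has(key)) {
      cache.set(key, rawMeasure(keyChar, style));
    }
    return cache.get(key);
  }

  return { measure, size: () => cache.size };
}
```

Under these assumptions, measuring "你" and then "好" in the same style triggers only one real measurement, while changing any style field produces a new cache entry.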

3. Splitting and Mapping

Each node in the view model receives a unique identifier (e.g., paragraph-1). When a node is split, the identifier becomes originalId-splitIndex, enabling the editor to map user interactions back to the original document model.
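
The mapping can be sketched as below. The fragment shape (originalId, start, text) and the function names are hypothetical. Because an identifier such as paragraph-1 already contains a hyphen, this sketch looks fragments up in a Map rather than parsing the id string; whether the real editor parses or stores the mapping is not stated in the source.

```javascript
// Split one view-model node into line fragments whose ids follow the
// "originalId-splitIndex" pattern. `start` records where each fragment
// begins inside the original node's text.
function splitParagraph(node, lines) {
  const fragments = [];
  let start = 0;
  lines.forEach((text, index) => {
    fragments.push({ id: `${node.id}-${index}`, originalId: node.id, start, text });
    start += text.length;
  });
  return fragments;
}

// Map a position inside a fragment back to the original document model.
function toModelPoint(fragmentsById, fragmentId, offset) {
  const fragment = fragmentsById.get(fragmentId);
  if (!fragment) throw new Error(`Unknown fragment: ${fragmentId}`);
  return { key: fragment.originalId, offset: fragment.start + offset };
}
```

For example, if paragraph-1 is split into "Hello " and "world", offset 2 in fragment paragraph-1-1 maps back to offset 8 in paragraph-1.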

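4. Summary of the Editing Data Flow

Putting the steps together: the layout engine measures and splits the document model into a view model, the view model is rendered to the DOM, and user interactions on the DOM are mapped back to the document model through the node identifiers.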

Editor Without contentEditable

Instead of using the native contentEditable API, DingTalk Docs implements its own selection calculation, rendering, and input handling to support layout‑aware editing.

1. Selection Calculation and Rendering

When a user clicks, the editor determines the cursor position by examining the event coordinates and performing a binary search on characters. Example pseudo‑code:

const { target, clientX, clientY } = event;
if (target is a void node like image or video) { select the node }
else if (target is a text node) { binary search to find character index }
else { adjust clientX/clientY, recurse into child nodes }
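
The binary search step for a text node might look like the following sketch. It assumes the caret boundaries of the line are already known as an ascending array of x coordinates (for instance, built from cached character widths); boundariesFromWidths and offsetFromX are hypothetical names, and snapping to the nearer boundary is a design choice of this sketch, not something the source specifies.

```javascript
// Build caret boundaries from character widths: boundaries[i] is the x
// coordinate of the caret placed before character i, so the array has
// widths.length + 1 entries.
function boundariesFromWidths(widths, left = 0) {
  const boundaries = [left];
  for (const w of widths) {
    boundaries.push(boundaries[boundaries.length - 1] + w);
  }
  return boundaries;
}

// Binary search for the caret offset closest to the click's x coordinate.
function offsetFromX(boundaries, x) {
  let lo = 0;
  let hi = boundaries.length - 1;
  // Find the largest index whose boundary is <= x (or 0 if none is).
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (boundaries[mid] <= x) lo = mid;
    else hi = mid - 1;
  }
  // Snap to the next boundary when the click is closer to it.
  if (lo < boundaries.length - 1 && x - boundaries[lo] > boundaries[lo + 1] - x) {
    return lo + 1;
  }
  return lo;
}
```

A click past the right edge of a four-character line resolves to offset 4, the end of the text, and a click left of the line resolves to offset 0.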

The resulting selection is described as:

Value.create({
  selection: Selection.create({
    anchor: Point.create({ key: 'Good', offset: 4 }),
    focus: Point.create({ key: 'Good', offset: 4 })
  })
});

2. Input Composition

A hidden textarea captures user input and IME composition. During composition, the state is represented as:

Value.create({ composing: 'hai' });

After the user confirms the text, the document model updates:

Value.create({
  document: Document.create({
    nodes: [Paragraph.create({ nodes: [Text.create('嗨')] })]
  })
});
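
The composition flow can be sketched as a pure reducer over the standard DOM composition events (compositionstart, compositionupdate, compositionend), each of which carries the current string in its data property. The flat value shape used here (text, offset, composing) is a hypothetical simplification of the Value model shown above, and reduceInput is an illustrative name.

```javascript
// Reduce composition events from the hidden textarea into editor state.
// `value` is a simplified, hypothetical stand-in for the editor's Value:
//   text      - committed document text
//   offset    - collapsed cursor position inside `text`
//   composing - in-progress IME string, or null when not composing
function reduceInput(value, event) {
  switch (event.type) {
    case 'compositionstart':
      return { ...value, composing: '' };
    case 'compositionupdate':
      // Only the transient composing string changes; the document does not.
      return { ...value, composing: event.data };
    case 'compositionend': {
      // The confirmed text is committed at the cursor and composition ends.
      const { text, offset } = value;
      return {
        text: text.slice(0, offset) + event.data + text.slice(offset),
        offset: offset + event.data.length,
        composing: null,
      };
    }
    default:
      return value;
  }
}
```

Keeping the in-progress string out of the document until compositionend means intermediate keystrokes such as "hai" never reach the document model.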

3. Final Rendering Structure

The editor renders the combined value:

<editor>
  <content {value.document + value.composing} />
  <selection {value.selection} />
</editor>
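
To make the combination concrete, the sketch below derives what the content and selection layers would display from a simplified value. The flat shape (text, offset, composing) and the name renderView are assumptions for illustration only; the real editor renders a structured document rather than a single string.

```javascript
// Derive the visible content and caret position from a simplified value.
// The in-progress composing string is shown at the cursor without being
// written into the committed text.
function renderView(value) {
  const { text, offset, composing } = value;
  const pending = composing ?? '';
  return {
    content: text.slice(0, offset) + pending + text.slice(offset),
    caret: offset + pending.length,
  };
}
```

With text "Good", the cursor at offset 4, and "hai" being composed, the content layer shows "Goodhai" and the caret sits at position 7.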

Multi‑User Collaborative Editing

DingTalk Docs supports real‑time collaboration using Operational Transformation (OT). Edits are transformed into atomic operations; the engine resolves conflicts similarly to Git rebase.

The editor converts user actions into nine atomic operation types, which drive the document model updates.

Operations are sent to the collaborative engine, transformed, and applied to keep all participants in sync.
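
The core idea of transformation can be shown with a deliberately small sketch covering only concurrent text insertions. The operation shape (offset, text, site) and the site-id tie-break are assumptions; the source does not list the nine operation types or describe how the engine breaks ties.

```javascript
// Apply a hypothetical insert operation to a plain string.
function applyInsert(text, op) {
  return text.slice(0, op.offset) + op.text + text.slice(op.offset);
}

// Transform `op` so that it can be applied after the concurrent `against`.
// If `against` inserted earlier in the text (or at the same offset with a
// lower site id), `op` shifts right by the inserted length.
function transformInsert(op, against) {
  const shifts =
    against.offset < op.offset ||
    (against.offset === op.offset && against.site < op.site);
  return shifts ? { ...op, offset: op.offset + against.text.length } : op;
}
```

Two participants who apply the same pair of concurrent insertions in opposite orders, each transforming the second operation against the first, end up with identical text. This convergence is the property the collaborative engine relies on.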

Open‑Source Plan

The editor SDK was designed for reuse across more than 20 products (e.g., ATA, Aliway) and will be open‑sourced to encourage community contributions.
