How Operational Transformation Powers Real-Time Collaborative Editing
This article explains the core principles of Operational Transformation (OT) for real-time collaborative document editing, compares it with diff‑patch approaches, demonstrates JavaScript implementations, and details how OT handles concurrency, version control, and state management in client‑server architectures.
Operational Transformation Overview
Preface
When building an online document editor for the frontend, several key problems must be solved:
Basic text editing functionality requires a document model to describe the document.
A rich‑text editor that provides editing and rendering capabilities.
Collaboration so that different users see the same document state.
A collaboration network model to keep the server and client document models consistent.
Terminology
OT : an algorithm that solves collaboration problems.
OP : short for operation, a single operation in OT.
etherpad : an open‑source library that implements document collaboration.
easysync : the core OT algorithm used in etherpad for text collaboration.
ot‑json : an OT variant for structured data.
Changeset : a data format that describes a document change.
ClientVars : the initialization data of a document, usually a series of changesets.
Symbol Explanation
|: move cursor. ·: overlay.
Main Content
OT Algorithm
What is the OT algorithm? The simplest brute‑force approach to multi‑user editing is a document lock, which discards concurrent edits and provides a terrible user experience.
Editing Lock
If user A locks the document on the server, user B’s edits are discarded. Many wiki systems use this naive method, leading to lost content and no real‑time updates.
Linux diff‑patch
Linux provides diff and patch. A JavaScript implementation could work as follows:
User opens the document, establishes a long‑lived connection, and saves a local copy.
When the user pauses (e.g., 3 seconds), the client diffs the current document against the copy and sends the diff to the server, updating the copy.
The server applies the diff and pushes the result to other users, who use patch to update their local copy.
# 本地文档
$ echo '复仇者联盟
钢铁侠
美国队长' > test-local.txt
# 生成用户A编辑后的文档
$ echo '复仇者联盟
钢铁侠
绿巨人' > test-userA.txt
# diff两个文档
$ diff test-local.txt test-userA.txt > diff-test.patch
# 查看diff-test.patch内容
$ cat diff-test.patch
3c3
< 美国队长
---
> 绿巨人Applying the patch to user B’s document yields the merged result.
# 生成用户B编辑的文档
$ echo '复仇者联盟
黑寡妇
美国队长' > test-userB.txt
# patch方法更新文档
$ patch test-userB.txt < diff-test.patch
$ cat test-userB.txt
复仇者联盟
黑寡妇
绿巨人This line‑based diff works but fails when edits occur on the same line, causing conflicts.
diff‑match‑patch
The diff‑match‑patch library performs character‑level diffs, avoiding many line‑based issues. Example usage:
// 示例1
const localText = '复仇者联盟';
const userAText = '复仇者联盟钢铁侠';
const userBText = '复仇者联盟美国队长';
// 结果为:复仇者联盟钢铁侠美国队长However, character‑level diffs can still drop characters in certain cases, which is unacceptable for rich‑text documents.
Operation Transformation (OT)
OT solves the character‑loss problem. Using ot.js:
const str = '复仇者 Iron Man';
const operation0 = new ot.TextOperation().delete('复仇者 ').retain(8).insert(' 钢铁侠');
const operation1 = new ot.TextOperation().retain(4).delete('Iron Man').insert('Captain');
const op = ot.TextOperation.transform(operation0, operation1);
// 结果:Captain 钢铁侠OT defines three atomic operations:
insert (add characters)
delete (remove characters)
retain (move cursor / keep characters unchanged)
Note: diff‑match‑patch also uses insert, delete, and equal (unchanged) operations.
Insert
const str = '';
const operation = new ot.TextOperation().insert('复仇者联盟');
const result = operation.apply(str);
console.log(result); // 复仇者联盟Retain
const str = '复仇者联盟';
const operation = new ot.TextOperation().retain(5).insert('钢铁侠');
const result = operation.apply(str);
console.log(result); // 复仇者联盟钢铁侠Delete
const str = '复仇者联盟钢铁侠';
const operation = new ot.TextOperation().retain(5).delete('钢铁侠');
const result = operation.apply(str);
console.log(result); // 复仇者联盟Transform
Transform merges concurrent operations. Example 1 merges two independent insertions:
const str = ' ';
const op0 = new ot.TextOperation().insert('钢铁侠');
const op1 = new ot.TextOperation().insert('雷神');
const [opA, opB] = ot.TextOperation.transform(op0, op1);
console.log(opA.apply(op1.apply(str))); // 钢铁侠雷神More complex examples demonstrate how OT resolves overlapping edits while preserving all characters.
State Control
OT clients maintain three states:
Synchronized : no pending operations.
AwaitingConfirm : an operation has been sent and is awaiting server acknowledgment.
AwaitingWithBuffer : an operation is pending and new local edits have been buffered.
When the server pushes a new operation, the client performs one or two transforms and possibly a compose to integrate the change.
Client‑Server Model
Clients send changesets to the server, which assigns version numbers and broadcasts transformed changesets to other clients. The server’s workflow includes:
Receiving a changeset and locating its base version.
Transforming the incoming changeset against any intermediate changesets.
Broadcasting the transformed changeset to other clients.
Sending an ACK to the originating client.
Storing the changeset and updating the document version.
Easysync and Etherpad
Easysync is an OT‑based algorithm used by Etherpad. It represents documents with a clientVars structure containing text and attribs. Changesets describe modifications using symbols such as Z, :, >, <, |, *, +, -, and =, which map to OT’s insert, delete, and retain operations.
Example changeset for inserting "美国队长": Z:c>4|1=7=4*0+4$ 美国队长 Changesets can be composed when sequential, or merged (using merge) when concurrent. The follow method performs the OT transform on the server side.
Etherpad‑lite, the Node.js version of Etherpad, uses easysync for real‑time collaborative editing and is open‑source under the Apache license.
Conclusion
Operational Transformation remains the dominant algorithm for collaborative text editing in products like Google Docs, Tencent Docs, and Feishu Docs, despite the emergence of CRDTs for peer‑to‑peer scenarios.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
