How to Build Low‑Cost Real‑Time Collaborative Editing: From Edit Locks to Distributed OT
This article surveys practical approaches for implementing real‑time collaborative editing, comparing simple edit‑lock methods, line‑based diff/patch techniques, character‑level Myer's diff, and Operational Transformation—including distributed challenges—and offers recommendations based on project size and performance needs.
What Is Real‑Time Collaborative Editing
Real‑time collaborative editing allows multiple users to edit the same document simultaneously, as seen in Google Docs, where changes appear instantly without manual refresh.
Implementing it requires solving two technical problems: real‑time communication (easily handled with long‑polling or WebSocket) and edit‑conflict resolution, which is the focus of this article.
Possible Solutions
The following methods are presented from simplest to most complex: edit lock, GNU diff‑patch, Myer's diff‑patch, Operational Transformation (OT), and distributed OT.
Edit Lock
An edit lock prevents concurrent edits by locking the document when a user is editing it. This approach is simple and widely used (e.g., in internal TWiki systems) but offers poor user experience and no true real‑time capability.
GNU diff‑patch
Version‑control tools like Git use diff‑patch to merge changes. The process can be reproduced in JavaScript:
When a user joins, establish a long‑running connection and store a local copy of the document.
After a pause (e.g., 5 seconds), compute a diff between the current document and the stored copy, then send the diff to the server.
The server updates the master document and pushes the diff to other connected users, who apply it with patch.
Because GNU diff works line‑by‑line, simultaneous edits on the same line cause conflicts. The article shows examples with the texts “百度 Web” and “百度 Web 前端” and illustrates how patch creates a .rej file for unresolved conflicts.
Myer's diff‑patch
Myer's algorithm performs character‑level diff, reducing line‑based conflicts. Sample output demonstrates how it identifies insertions and deletions with precise positions. Experiments show it resolves most conflicts, though it can lose characters when multiple users edit the same position or when handling rich‑text markup.
Open‑source libraries such as changesets implement this algorithm and are used by editors like Codebox.
Operational Transformation (OT)
OT is the technique employed by Google Docs. It converts edits into three primitive operations: retain(n): keep n characters unchanged. insert(str): insert the string str . delete(str): delete the string str .
When two users edit concurrently, their operation sequences are transformed so that later operations adapt to earlier ones. The article walks through an example where user A changes “百度 Web” to “Web 前端” and user B changes it to “百度 FE”, showing how B’s operations must be transformed to avoid loss of data.
Using the open‑source changesets library, the author demonstrates that OT can achieve higher accuracy than Myer's diff, though the library’s implementation may still have quirks.
Distributed Operational Transformation
Scaling OT to multiple servers introduces three major challenges:
Ordering problem : Asynchronous requests can arrive out of order, causing divergent document states. A simple solution is to serialize requests with queues on both client and server.
Atomic storage : Concurrent writes to a shared database can overwrite each other. Solutions include single‑threaded servers, database transactions, or distributed locks (e.g., ZooKeeper), each with trade‑offs.
Version management : Different servers may operate on different document versions. By storing a version number with each edit and using diff to compute the necessary transformation, servers can reconcile divergent edits. The article illustrates this with version numbers v=1 → v=2 and shows how to adjust operations accordingly.
Storing every version can be costly; optimizations such as keeping only recent snapshots are possible. Existing frameworks like ShareJS and OpenCoweb provide reference implementations.
Preliminary Conclusions
If you have a small internal project with low real‑time requirements but high accuracy needs, use merge or diff3 and let users resolve line conflicts manually.
If you need moderate real‑time performance, low traffic, and can tolerate occasional conflicts, Myer's diff with a single Node process is a good fit.
If you require real‑time collaboration across multiple backend servers, consider OT or Myer's diff, but be prepared to handle distributed ordering, atomicity, and versioning issues.
For fine‑grained control over rich‑text or non‑plain‑text formats, implement a custom OT solution that can merge complex structures (e.g., XML).
Further Topics
Displaying other users' selections (cursor sharing) and handling related merge conflicts.
Keeping cursor positions in sync as the text changes.
Implementing undo functionality in a collaborative environment.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
