Evolution Stages of Rich Text Editors: From L0 to L2
This article outlines the evolution of rich‑text editors through L0, L1, and L2 stages, comparing their underlying APIs, data models, selection handling, and collaborative capabilities, and evaluates the advantages and drawbacks of each stage with examples such as UEditor, Quill, Slate, and Google Docs.
Development History
Rich‑text editors can be divided into three stages: L0, L1, and L2. Each successive stage offers more customization and fewer browser‑dependent issues, at the cost of greater development complexity.
L0 Stage
Early editors rely heavily on the DOM API, using contenteditable and document.execCommand. Notable examples include UEditor and CKEditor 1‑4. This stage has a low technical barrier and fluid native input, but it suffers from inconsistent browser implementations, complex DOM structures, cursor placement issues, and a lack of collaboration support.
Advantages
Low technical threshold.
Native browser editing provides fluid input.
No special handling is needed for composition (IME) input, because the browser handles it natively.
Disadvantages
The same command (e.g., applying bold and italic) produces different DOM structures in different browsers.
Selection representation varies, leading to ambiguous deletions.
Cursor insertion points are unpredictable.
Copy‑paste behavior is unpredictable due to arbitrary HTML.
No collaborative support.
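To make the first disadvantage concrete, here is a minimal sketch in TypeScript. The three trees are hypothetical examples of the kind of markup different browsers might emit for bold‑italic text, not recorded browser output, and `HtmlNode`, `marksOf`, and `toRuns` are names invented for this illustration. The sketch shows that structurally different trees can carry the same logical content, which is why an L0 editor cannot compare or merge documents by looking at the DOM alone.

```typescript
// Simplified stand-in for a DOM node: a text string or an element (hypothetical type).
interface HtmlElementNode {
  tag: string;
  style?: Record<string, string>;
  children: HtmlNode[];
}
type HtmlNode = string | HtmlElementNode;

interface Run { text: string; marks: string[] }

// Marks implied by a tag name or an inline style (illustrative subset only).
function marksOf(node: HtmlElementNode): string[] {
  const marks: string[] = [];
  if (node.tag === "b" || node.tag === "strong") marks.push("bold");
  if (node.tag === "i" || node.tag === "em") marks.push("italic");
  if (node.style && node.style["font-weight"] === "bold") marks.push("bold");
  if (node.style && node.style["font-style"] === "italic") marks.push("italic");
  return marks;
}

// Flatten a tree into text runs, each with a sorted, de-duplicated mark list.
function toRuns(node: HtmlNode, inherited: string[] = []): Run[] {
  if (typeof node === "string") {
    const unique = inherited.filter((m, i) => inherited.indexOf(m) === i).sort();
    return [{ text: node, marks: unique }];
  }
  const marks = inherited.concat(marksOf(node));
  return node.children.reduce<Run[]>(
    (runs, child) => runs.concat(toRuns(child, marks)),
    []
  );
}

// Three assumed variants of "bold italic" markup.
const variantA: HtmlNode = { tag: "b", children: [{ tag: "i", children: ["text"] }] };
const variantB: HtmlNode = { tag: "i", children: [{ tag: "b", children: ["text"] }] };
const variantC: HtmlNode = {
  tag: "span",
  style: { "font-weight": "bold", "font-style": "italic" },
  children: ["text"],
};

// All three flatten to the same run: [{ text: "text", marks: ["bold", "italic"] }]
const runsA = JSON.stringify(toRuns(variantA));
const runsB = JSON.stringify(toRuns(variantB));
const runsC = JSON.stringify(toRuns(variantC));
```

An L0 editor has no such normalized form, so every feature that reads the document has to cope with each structural variant.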
L1 Stage
Most modern editors (Quill, CKEditor 5, Slate, Draft.js, etc.) still use contenteditable but replace document.execCommand with custom implementations and introduce an abstract data model (often called the “Model”).
Model – Abstract Content Model
Quill uses a Delta model consisting of retain, insert, and delete operations with optional attributes. Example:
{
  "ops": [
    {"insert": "A\nB "},
    {"insert": "C", "attributes": {"bold": true}},
    {"insert": " D"}
  ]
}
Slate instead keeps a nested tree structure that mirrors the DOM, representing the same content as a JSON array of nodes:
[
  {"type":"paragraph","children":[{"text":"A"}]},
  {"type":"paragraph","children":[{"text":"B "},{"text":"C","bold":true},{"text":" D"}]}
]
View – Rendering the Model
The view layer (often a React render function) converts the model into DOM, giving the editor full control over the output and avoiding L0’s inconsistent structures.
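As a rough illustration of a view layer, the sketch below renders a small Slate‑like tree to an HTML string. It is a plain function rather than a React component, and the types `TextLeaf` and `ParagraphNode` and the functions `renderLeaf` and `renderDocument` are assumptions made for this example, not part of any editor’s API. Because the output is derived only from the model, the same model always produces the same markup.

```typescript
// Hypothetical model types, loosely shaped like a Slate document.
interface TextLeaf { text: string; bold?: boolean }
interface ParagraphNode { type: "paragraph"; children: TextLeaf[] }

// Escape characters that would otherwise be parsed as markup.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// One leaf becomes either plain text or a <strong> element.
function renderLeaf(leaf: TextLeaf): string {
  const text = escapeHtml(leaf.text);
  return leaf.bold ? "<strong>" + text + "</strong>" : text;
}

// One paragraph node becomes exactly one <p> element.
function renderDocument(nodes: ParagraphNode[]): string {
  return nodes
    .map((node) => "<p>" + node.children.map(renderLeaf).join("") + "</p>")
    .join("");
}

const sampleDoc: ParagraphNode[] = [
  { type: "paragraph", children: [{ text: "Hello " }, { text: "world", bold: true }] },
  { type: "paragraph", children: [{ text: "a < b" }] },
];

const renderedHtml = renderDocument(sampleDoc);
// renderedHtml === "<p>Hello <strong>world</strong></p><p>a &lt; b</p>"
```

A real view layer would also map DOM nodes back to model nodes so that selections and input events can be translated, which this sketch leaves out.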
Selection Wrappers
Both L0 and L1 need to wrap the native Selection API. Quill represents a selection as a single range with {index, length}, while Slate uses anchor and focus objects containing path and offset.
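The two selection shapes describe the same positions in different coordinate systems. The sketch below converts a flat {index, length} range into an anchor/focus pair over a two‑level tree. It assumes each paragraph contributes its text plus one trailing newline to the flat index and that every paragraph has at least one leaf. The type and function names are hypothetical and are not taken from either library.

```typescript
// Hypothetical minimal tree: paragraphs containing text leaves.
interface Leaf { text: string }
interface Paragraph { children: Leaf[] }

// Flat, Quill-style range and tree, Slate-style range (names are illustrative).
interface FlatRange { index: number; length: number }
interface TreePoint { path: number[]; offset: number }
interface TreeRange { anchor: TreePoint; focus: TreePoint }

// Resolve a flat index to a leaf path and an offset inside that leaf.
// A position on a leaf boundary resolves to the end of the earlier leaf.
function indexToPoint(paragraphs: Paragraph[], index: number): TreePoint {
  let remaining = index;
  for (let p = 0; p < paragraphs.length; p++) {
    const leaves = paragraphs[p].children;
    for (let l = 0; l < leaves.length; l++) {
      const len = leaves[l].text.length;
      if (remaining <= len) return { path: [p, l], offset: remaining };
      remaining -= len;
    }
    remaining -= 1; // the assumed newline that ends each paragraph
  }
  throw new RangeError("index " + index + " is outside the document");
}

// Convert a forward flat range to an anchor/focus pair.
function flatToTree(paragraphs: Paragraph[], range: FlatRange): TreeRange {
  return {
    anchor: indexToPoint(paragraphs, range.index),
    focus: indexToPoint(paragraphs, range.index + range.length),
  };
}

const paragraphs: Paragraph[] = [
  { children: [{ text: "A" }] },
  { children: [{ text: "B " }, { text: "C" }, { text: " D" }] },
];

// Flat text is "A\nB C D\n"; index 4 with length 1 covers the "C".
const treeRange = flatToTree(paragraphs, { index: 4, length: 1 });
// treeRange.anchor is { path: [1, 0], offset: 2 }, treeRange.focus is { path: [1, 1], offset: 1 }
```

The flat form is compact and easy to transform, while the tree form stays valid for nested structures; this trade‑off follows from the two data models above.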
Commands – Core of L1
L1 editors implement their own command APIs, typically using event listeners (e.g., beforeinput) or a MutationObserver to infer user actions and update the model.
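A command layer of this kind can be sketched as a pure function from the current model and a user intent to a new model. The `inputType` strings below are real values carried by beforeinput events, but `EditorState`, `InputIntent`, and `applyIntent` are hypothetical names, and the model is reduced to a string with a caret. Composition (IME) input and range selections are deliberately not handled.

```typescript
// Hypothetical minimal model: plain text plus a collapsed caret position.
interface EditorState { text: string; caret: number }

// The subset of a beforeinput event that this sketch reads.
interface InputIntent { inputType: string; data?: string | null }

// Translate one user intent into a new model state without touching the DOM.
function applyIntent(current: EditorState, intent: InputIntent): EditorState {
  const text = current.text;
  const caret = current.caret;
  switch (intent.inputType) {
    case "insertText": {
      const data = intent.data == null ? "" : intent.data;
      return { text: text.slice(0, caret) + data + text.slice(caret), caret: caret + data.length };
    }
    case "insertParagraph":
      return { text: text.slice(0, caret) + "\n" + text.slice(caret), caret: caret + 1 };
    case "deleteContentBackward":
      if (caret === 0) return current; // nothing to delete before the caret
      return { text: text.slice(0, caret - 1) + text.slice(caret), caret: caret - 1 };
    default:
      return current; // unhandled input types leave the model unchanged
  }
}

// In a browser this could be wired up roughly as follows (not executed here):
//   editable.addEventListener("beforeinput", (e) => {
//     e.preventDefault();                      // stop the browser's own DOM edit
//     editorState = applyIntent(editorState, e);
//     render(editorState);                     // hypothetical view function
//   });

const initialState: EditorState = { text: "AB", caret: 1 };
const afterTyping = applyIntent(initialState, { inputType: "insertText", data: "X" });
// afterTyping is { text: "AXB", caret: 2 }
const afterBackspace = applyIntent(afterTyping, { inputType: "deleteContentBackward" });
// afterBackspace is { text: "AB", caret: 1 }
```

Keeping the command logic free of DOM access makes it straightforward to test and to replay the same intents on another client.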
Operations – Collaborative Editing
Operations record atomic actions; they are the basis for Operational Transformation (OT). Quill’s OT implementation lives in the quill-delta library.
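The core idea of OT can be shown with a deliberately small sketch that handles only text insertions. It does not use quill-delta; `InsertOp`, `applyInsert`, and `transformInsert` are names made up for this example. When two clients edit the same base document concurrently, each transforms the other’s operation against its own before applying it, so both arrive at the same result.

```typescript
// Hypothetical operation type: insert `text` at position `index`.
interface InsertOp { index: number; text: string }

function applyInsert(content: string, op: InsertOp): string {
  return content.slice(0, op.index) + op.text + content.slice(op.index);
}

// Rewrite `op` so it can be applied after `against` has already been applied.
// `opHasPriority` breaks ties when both operations insert at the same index.
function transformInsert(op: InsertOp, against: InsertOp, opHasPriority: boolean): InsertOp {
  const mustShift =
    against.index < op.index || (against.index === op.index && !opHasPriority);
  return mustShift ? { index: op.index + against.text.length, text: op.text } : op;
}

const baseText = "AB";
const opX: InsertOp = { index: 1, text: "X" }; // made on client 1
const opY: InsertOp = { index: 1, text: "Y" }; // made concurrently on client 2

// Client 1 applies its own op, then client 2's op transformed against it.
const client1 = applyInsert(applyInsert(baseText, opX), transformInsert(opY, opX, false));
// Client 2 applies its own op, then client 1's op transformed against it.
const client2 = applyInsert(applyInsert(baseText, opY), transformInsert(opX, opY, true));
// Both clients converge: client1 === "AXYB" and client2 === "AXYB"
```

A production implementation must also transform deletions and attribute changes, and must agree on which side has priority; this sketch only shows why transformation makes concurrent edits converge.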
L2 Stage
Google Docs exemplifies L2 editors, which abandon contenteditable entirely and implement their own layout engine, drawing content on canvas. This gives consistent selection and cursor rendering and makes advanced features such as pagination and footnotes possible.
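Owning the layout engine means the editor, not the browser, decides where each line breaks. As a minimal sketch of one such step, the function below performs greedy line breaking with an injected measuring function. The names `Measure` and `breakLines` are hypothetical, and nothing here reflects how Google Docs is actually implemented. In a browser, the measuring function could be backed by the canvas call `ctx.measureText(text).width`, and each line drawn with `ctx.fillText`.

```typescript
// Returns the rendered width of a string in pixels (supplied by the caller).
type Measure = (text: string) => number;

// Greedy line breaking: keep adding words to a line until the next one would overflow.
function breakLines(text: string, maxWidth: number, measure: Measure): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const lines: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && measure(candidate) > maxWidth) {
      lines.push(current); // the line is full, start a new one with this word
      current = word;
    } else {
      current = candidate; // a single overlong word still gets its own line
    }
  }
  if (current !== "") lines.push(current);
  return lines;
}

// Assumed fixed-width font for the example: every character is 10px wide.
const monospace: Measure = (text) => text.length * 10;

const wrapped = breakLines("the quick brown fox jumps", 100, monospace);
// wrapped is ["the quick", "brown fox", "jumps"]
```

Because the line positions are computed by the editor, the cursor and selection rectangles can be derived from the same measurements, which is where the consistency of this stage comes from.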
Future Directions
Google plans to move Google Docs rendering to canvas, further reducing reliance on the DOM, while browser standards continue to converge, potentially bringing some features back to native implementations.
TikTok Frontend Technology Team
We are the TikTok Frontend Technology Team, serving TikTok and multiple ByteDance product lines, focused on building frontend infrastructure and exploring community technologies.