From Terminal to Block Suites: Evolution of Web Editors and Their Core Technologies
This article traces the history of editor technology from early terminal editors to modern web-based and block-style editors, examines key architectural components such as contentEditable, model-view-controller design, and collaborative algorithms, and highlights future directions like multimodal GPT interaction.
The author is a former Ant Group front-end engineer. This article covers the history of editor technology and the solutions used for its key modules.
1. Introduction
Editors are among the most complex human-computer interaction scenarios, and their core requirement is WYSIWYG (what you see is what you get).
The main pain points are that requirements derived from WYSIWYG keep growing in complexity and become increasingly difficult to implement.
2. Development of Editor Technology
From early terminal editors to modern web editors, the evolution is driven by the need for WYSIWYG.
Terminal Editors
Before GUIs, the terminal was the earliest editing environment, with users interacting through a TTY or a virtual console.
GUIs later introduced terminal windows, which are still widely used.
Traditional Text/Rich Text Editors
GUI‑based editors emerged, focusing on text modification, find/replace, undo/redo, and highlighting. Browsers also support simple editors via input and textarea.
Rich‑text editors like Word appeared to meet image and chart layout needs; Word for Mac (1985) already offered WYSIWYG.
Web Editors Based on Browser ContentEditable
Once browsers supported contentEditable, web editors flourished, although they now had to handle HTML parsing and data formats themselves.
Representative open‑source projects include:
CKEditor (2008‑present)
Enables editing by turning on contentEditable.
Uses native execCommand for most operations.
Extends execCommand with custom commands.
Outputs raw HTML, applying DTD constraints and content‑filter rules.
CKEditor 5 abstracts DOM tree operations and adds collaborative editing.
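The idea of extending execCommand with custom commands can be sketched as a small registry that tries a custom handler first and falls back to the native command otherwise. This is a minimal illustration, not CKEditor's actual API: `CommandRegistry`, `CommandHandler`, and the injected `nativeExec` callback are hypothetical names, and `nativeExec` stands in for the browser call so the sketch runs without a DOM.

```typescript
// Hypothetical handler type: returns true when the command was applied.
type CommandHandler = (value?: string) => boolean;

class CommandRegistry {
  private readonly custom = new Map<string, CommandHandler>();
  // Stand-in for the browser's native command execution.
  private readonly nativeExec: (name: string, value?: string) => boolean;

  constructor(nativeExec: (name: string, value?: string) => boolean) {
    this.nativeExec = nativeExec;
  }

  // Register an editor-defined command such as "insertTable".
  register(name: string, handler: CommandHandler): void {
    this.custom.set(name, handler);
  }

  // Custom commands win; anything else is delegated to the native fallback.
  exec(name: string, value?: string): boolean {
    const handler = this.custom.get(name);
    return handler ? handler(value) : this.nativeExec(name, value);
  }
}
```

In a browser, the fallback could be `(name, value) => document.execCommand(name, false, value)`. Note that execCommand is deprecated, which is one reason later editors moved away from it.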
KissyEditor (2010)
Built on Kissy’s modular system.
Adopts CKEditor’s command system and custom HTML DTD.
Implements its own Selection and Range compatibility layer.
Quill (2012‑present)
Quill separates view and model, describing changes with a Delta (JSON‑like) format.
Uses contentEditable for basic editing.
Abstracts DOM and data modifications.
DOM changes are observed via MutationObserver and translated into Delta changes.
Custom operations directly update the Delta.
Quill’s design inspired many subsequent editor frameworks.
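To make the Delta idea concrete, here is a minimal sketch that applies a list of insert, retain, and delete operations to a plain string. The operation shapes follow Quill's documented Delta format, but `applyDelta` is a hypothetical helper written for this article, not part of Quill, and formatting attributes are left out.

```typescript
// Operation shapes follow Quill's Delta format; attributes are omitted here.
type Op = { insert: string } | { retain: number } | { delete: number };

function applyDelta(text: string, ops: Op[]): string {
  let index = 0; // read position in the original text
  let result = "";
  for (const op of ops) {
    if ("insert" in op) {
      result += op.insert; // new content, consumes nothing from the original
    } else if ("retain" in op) {
      result += text.slice(index, index + op.retain); // keep a run of characters
      index += op.retain;
    } else {
      index += op.delete; // skip over the deleted characters
    }
  }
  return result + text.slice(index); // the untouched tail is kept implicitly
}
```

For example, applying `[{ retain: 6 }, { delete: 5 }, { insert: "Quill" }]` to `"Hello World"` keeps the first six characters, drops the next five, and appends the new text. Because a change is plain data, it can be stored, sent over the network, or replayed.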
ProseMirror (2015‑present)
Pure JSON data description.
Redefines schema for flexible rendering.
Maintains its own document model, separate from the DOM and comparable to a virtual DOM, and translates user actions into data operations.
Uses immutable data for a unidirectional data flow.
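The immutable, unidirectional flow described above can be sketched as a pure function from an old state and a step to a new state. This is not ProseMirror's API: `DocNode`, `EditorState`, `SetTextStep`, and `applyStep` are simplified, hypothetical names chosen for illustration.

```typescript
interface DocNode {
  readonly type: string;
  readonly text?: string;
  readonly content?: readonly DocNode[];
}

interface EditorState {
  readonly doc: DocNode;
  readonly version: number;
}

// Hypothetical step: replace the text of one top-level child.
interface SetTextStep {
  readonly childIndex: number;
  readonly text: string;
}

// Pure function: never mutates `state`, always returns a new state object.
function applyStep(state: EditorState, step: SetTextStep): EditorState {
  const children = state.doc.content ?? [];
  if (step.childIndex < 0 || step.childIndex >= children.length) {
    throw new RangeError(`no child at index ${step.childIndex}`);
  }
  // Untouched children are reused by reference (structural sharing).
  const content = children.map((child, i) =>
    i === step.childIndex ? { ...child, text: step.text } : child
  );
  return { doc: { ...state.doc, content }, version: state.version + 1 };
}
```

Because the previous state is never modified, undo can be as simple as keeping earlier states, and the view can detect unchanged nodes with a reference comparison.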
Draftjs (2015‑2020)
Deeply integrates with React.
Turns on contentEditable but handles input with its own logic.
Defines block‑based schema and plugins.
Faced issues with block nesting and HTML parsing.
Slatejs (2016‑present)
Editor framework that adopts block‑based schema, immutable data, and a Transform command chain.
Provides plugin mechanisms and JSX‑based data creation.
Continuously refines schema, eventually removing it in favor of normalizeNode.
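Normalization replaces a declarative schema with code that repairs the document after each change. The sketch below shows the idea with two rules: adjacent text nodes with the same formatting are merged, and an element never ends up without children. It does not use Slate's API (where normalizeNode is an overridable editor method); `TextNode`, `ElementNode`, and `normalize` are hypothetical names for illustration.

```typescript
interface TextNode {
  text: string;
  bold?: boolean;
}

interface ElementNode {
  type: string;
  children: Array<ElementNode | TextNode>;
}

type SketchNode = ElementNode | TextNode;

const isText = (node: SketchNode): node is TextNode => "text" in node;

// Returns a repaired copy of `node`; the input is left untouched.
function normalize(node: ElementNode): ElementNode {
  const out: SketchNode[] = [];
  for (const child of node.children) {
    const next = isText(child) ? child : normalize(child);
    const prev = out[out.length - 1];
    // Rule 1: merge adjacent text nodes that share the same formatting.
    if (prev !== undefined && isText(prev) && isText(next) && !!prev.bold === !!next.bold) {
      out[out.length - 1] = { ...prev, text: prev.text + next.text };
    } else {
      out.push(next);
    }
  }
  // Rule 2: an element always contains at least one (possibly empty) text node.
  if (out.length === 0) out.push({ text: "" });
  return { ...node, children: out };
}
```

Expressing constraints as code lets an editor enforce rules that are hard to state in a static schema, at the cost of having to reason about the order in which fixes are applied.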
BlockSuite (2022‑present)
Inspired by Notion’s block‑style editing, BlockSuite uses Yjs CRDT to manage a typed block tree.
Leverages Yjs Shared Types for collaborative editing.
View layer can be implemented with Web Components or Canvas.
Supports local persistence and flexible rendering.
Beyond Browser‑Provided Editing
Some solutions avoid contentEditable entirely, implementing custom selection, cursor, and layout engines, sometimes using Canvas.
Custom DOM‑based editors (e.g., Youdao Cloud Note).
Hybrid approaches like early Google Docs.
Pure Canvas implementations (e.g., modern Google Docs, Tencent Docs).
3. Key Technical Areas
Architecture
Most editors follow an MVC pattern.
Model
ViewModel (Schema) maps data to UI rendering.
DataModel handles file loading, caching, and export.
View
Renders the editable UI, either via native HTML, framework‑driven virtual DOM, or Canvas.
Controller
Intercepts events, updates the model, and triggers view changes, often using MutationObserver, custom commands, or framework state management.
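The division of responsibilities above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the model is a single string, the view renders to an HTML string instead of a real DOM, and `TextModel`, `StringView`, and `EditorController` are hypothetical names.

```typescript
type Listener = (text: string) => void;

// Model: owns the text and notifies subscribers on every change.
class TextModel {
  private text = "";
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  insert(offset: number, chunk: string): void {
    this.text = this.text.slice(0, offset) + chunk + this.text.slice(offset);
    this.listeners.forEach((listener) => listener(this.text));
  }

  get value(): string {
    return this.text;
  }
}

// View: renders to an HTML string so the sketch runs without a DOM.
class StringView {
  html = "";

  render(text: string): void {
    const escaped = text.replace(/&/g, "&amp;").replace(/</g, "&lt;");
    this.html = `<p>${escaped}</p>`;
  }
}

// Controller: turns user events into model updates and never touches the view.
class EditorController {
  private caret = 0;
  private readonly model: TextModel;

  constructor(model: TextModel) {
    this.model = model;
  }

  type(chunk: string): void {
    this.model.insert(this.caret, chunk);
    this.caret += chunk.length;
  }
}
```

Wiring is one line, `model.subscribe((text) => view.render(text))`, after which every controller action flows through the model before it reaches the view.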
contentEditable Consistency
Browsers implement contentEditable inconsistently, so the same editing operation can produce different results across browsers for nested tags, cursor placement, and font rendering.
Rendering Choices
Rendering with native HTML performs well but offers limited control; framework-driven rendering adds flexibility but may incur re-render costs.
Input Events
Default browser behaviors for selection, cursor, and IME are inconsistent.
Solutions include beforeinput, composition events, and MutationObserver.
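One way to combine beforeinput and composition events is to map each event to an explicit model operation and to leave the text alone while an IME composition is in progress. The sketch below uses a plain object, `InputLike`, as a stand-in for the `inputType`, `data`, and `isComposing` fields of a DOM InputEvent so that it runs without a browser; `TextState`, `handleBeforeInput`, and `handleCompositionEnd` are hypothetical names.

```typescript
// Plain-object stand-in for the InputEvent fields used here.
interface InputLike {
  inputType: string;
  data: string | null;
  isComposing: boolean;
}

interface TextState {
  text: string;
  caret: number;
}

function insertAt(state: TextState, chunk: string): TextState {
  const { text, caret } = state;
  return {
    text: text.slice(0, caret) + chunk + text.slice(caret),
    caret: caret + chunk.length,
  };
}

function handleBeforeInput(state: TextState, ev: InputLike): TextState {
  // While an IME composition is active, do not touch the model.
  if (ev.isComposing) return state;
  switch (ev.inputType) {
    case "insertText":
      return insertAt(state, ev.data ?? "");
    case "insertParagraph":
      return insertAt(state, "\n");
    case "deleteContentBackward": {
      // Simplification: removes one UTF-16 code unit, not a full grapheme.
      const { text, caret } = state;
      if (caret === 0) return state;
      return { text: text.slice(0, caret - 1) + text.slice(caret), caret: caret - 1 };
    }
    default:
      return state; // other input types are ignored in this sketch
  }
}

// The committed IME text is applied once, when the composition ends.
function handleCompositionEnd(state: TextState, committed: string): TextState {
  return insertAt(state, committed);
}
```

In a browser, the beforeinput listener would typically call preventDefault for the handled input types and then re-render from the new state, so the DOM only ever reflects the model.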
Selection and Cursor
Editors manipulate Selection and Range objects; custom handling is needed for multi‑range or block‑level selections.
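Editors with their own model usually mirror the browser's Selection (anchor and focus, which can run backward) and Range (ordered start and end) in model coordinates. The sketch below assumes a flat list of blocks addressed by index; `Point`, `ModelSelection`, `ModelRange`, `toRange`, and `touchedBlocks` are hypothetical names for illustration.

```typescript
interface Point {
  block: number; // index of the block in the document
  offset: number; // character offset inside that block
}

// Like a DOM Selection: anchor is where the drag started, focus where it ended.
interface ModelSelection {
  anchor: Point;
  focus: Point;
}

// Like a DOM Range: start never comes after end.
interface ModelRange {
  start: Point;
  end: Point;
  backward: boolean;
  collapsed: boolean;
}

function comparePoints(a: Point, b: Point): number {
  return a.block !== b.block ? a.block - b.block : a.offset - b.offset;
}

function toRange(sel: ModelSelection): ModelRange {
  const order = comparePoints(sel.anchor, sel.focus);
  const backward = order > 0;
  return {
    start: backward ? sel.focus : sel.anchor,
    end: backward ? sel.anchor : sel.focus,
    backward,
    collapsed: order === 0,
  };
}

// Indices of every block the range touches, for block-level operations.
function touchedBlocks(range: ModelRange): number[] {
  const blocks: number[] = [];
  for (let i = range.start.block; i <= range.end.block; i++) blocks.push(i);
  return blocks;
}
```

Keeping the direction flag matters for keyboard extension of a selection (for example Shift plus arrow keys), where the focus end moves and the anchor stays fixed.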
Canvas‑Based Editing
Canvas eliminates DOM constraints, requiring custom implementations for selection, layout, input (often using hidden input), and partial rendering.
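Two of the pieces a Canvas editor has to build itself, line layout and mapping a click to a caret offset, can be sketched with an injected text-measuring function. In a browser the `measure` callback could wrap `ctx.measureText(s).width` from a 2D canvas context; here it is a parameter so the sketch runs without a canvas. `breakLines` and `offsetAtX` are hypothetical names, and the greedy word wrap ignores many real-world concerns such as long words, bidirectional text, and kerning across line breaks.

```typescript
type Measure = (s: string) => number;

// Greedy word wrap: start a new line when the next word would overflow.
function breakLines(text: string, maxWidth: number, measure: Measure): string[] {
  const lines: string[] = [];
  let line = "";
  for (const word of text.split(" ")) {
    const candidate = line === "" ? word : line + " " + word;
    if (line !== "" && measure(candidate) > maxWidth) {
      lines.push(line);
      line = word;
    } else {
      line = candidate;
    }
  }
  lines.push(line);
  return lines;
}

// Hit testing: the caret goes to whichever character boundary is nearest to x.
function offsetAtX(line: string, x: number, measure: Measure): number {
  for (let i = 0; i < line.length; i++) {
    const left = measure(line.slice(0, i));
    const right = measure(line.slice(0, i + 1));
    if (x < (left + right) / 2) return i;
  }
  return line.length;
}
```

Measuring prefixes rather than summing single-character widths keeps hit testing consistent with how the line was drawn, at the cost of more measurement calls; a real implementation would likely cache these widths.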
Block‑Style Editing
Blocks isolate editable regions, simplifying control but complicating inline media handling; Notion and BlockSuite exemplify this approach.
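A block-style document is commonly stored as a flat map of blocks that reference their children by id, which makes moving or syncing a single block cheap. The sketch below shows that shape with plain TypeScript data. It is not BlockSuite's or Notion's data model and does not use Yjs; `Block`, `BlockStore`, and the three block types are hypothetical.

```typescript
interface Block {
  id: string;
  type: "page" | "paragraph" | "image";
  text?: string;
  children: string[]; // child ids, in document order
}

class BlockStore {
  private readonly blocks = new Map<string, Block>();

  // Adds a block and, if a parent id is given, appends it to that parent.
  add(block: Omit<Block, "children">, parentId?: string): void {
    if (this.blocks.has(block.id)) throw new Error(`duplicate id ${block.id}`);
    const parent = parentId === undefined ? undefined : this.blocks.get(parentId);
    if (parentId !== undefined && parent === undefined) {
      throw new Error(`unknown parent ${parentId}`);
    }
    this.blocks.set(block.id, { ...block, children: [] });
    parent?.children.push(block.id);
  }

  // Depth-first walk: the order in which blocks would be rendered.
  walk(rootId: string): Block[] {
    const block = this.blocks.get(rootId);
    if (!block) return [];
    const out: Block[] = [block];
    for (const id of block.children) out.push(...this.walk(id));
    return out;
  }
}
```

Each block can then own its editable region, so an image block and a paragraph block can use entirely different rendering and input handling while sharing one tree.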
Collaboration
Real‑time multi‑user editing relies on OT (Operational Transformation) or CRDT (Conflict‑free Replicated Data Types); BlockSuite uses Yjs CRDT.
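The core of OT can be shown with the simplest case, two users inserting text concurrently. Each site applies its own operation first and then applies the remote one after transforming it against the local one, and both sites must end up with the same text. This is a minimal sketch that covers inserts only, with ties at the same position broken by a site id; `Insert`, `applyInsert`, and `transform` are hypothetical names, and it is not how Yjs (a CRDT) works internally.

```typescript
interface Insert {
  pos: number; // position in the document the author saw
  text: string;
  site: number; // unique id of the authoring site, used to break ties
}

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrites `op` so it can be applied after `against` has already been applied.
function transform(op: Insert, against: Insert): Insert {
  const insertedBefore =
    against.pos < op.pos || (against.pos === op.pos && against.site < op.site);
  // Text inserted before our position shifts us right by its length.
  return insertedBefore ? { ...op, pos: op.pos + against.text.length } : op;
}
```

With the document `"abc"`, site 1 inserting `"X"` at position 1 and site 2 inserting `"Y"` at position 1, both orders of application converge on `"aXYbc"`. Deletes and formatting make the transform rules considerably more intricate, which is part of the appeal of CRDTs.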
4. Outlook
This article has outlined the historical development and core challenges of editors. Future directions may include multimodal GPT-driven interaction and domain-specific optimization for code editors, graphic editors, general document editors, and specialized industry editors.
Alipay Experience Technology
Exploring ultimate user experience and best engineering practices
