How Open WebUI Builds a Production‑Grade AI Chat UI with Svelte

This article dissects the open‑source Open WebUI project, explaining its three‑layer architecture, key Svelte components, streaming rendering pipeline, message‑tree data model, state management, RAG integration, and plugin system, while offering practical optimization tips for building a performant production‑level AI chat interface.

Sohu Tech Products

Why Build a Production‑Grade AI Chat UI?

Creating a simple chat UI is easy, but delivering a production‑ready AI conversation experience requires handling streaming rendering, markdown and code highlighting, multi‑model switching, session management, knowledge‑base integration, and a flexible plugin system.

Overall Architecture

Open WebUI follows a three‑layer architecture:

Frontend layer: Svelte 5 + SvelteKit

Backend layer: Python + FastAPI (REST API & WebSocket)

Data layer: SQLite / PostgreSQL for relational data and a vector database for RAG embeddings

Key frontend tech stack:

Svelte 5 + SvelteKit (framework)

TypeScript (type safety)

Vite (build tool)

Tailwind CSS 4 (styling)

Tiptap (rich‑text editor based on ProseMirror)

CodeMirror (code editor)

KaTeX (math formula rendering)

Key Front‑End Components

1. Sidebar (Session List)

Manages all conversations, supports folder grouping, search, pinning, archiving, deletion, and drag‑and‑drop ordering. Each folder can bind a system prompt and knowledge‑base IDs, which are automatically inherited by new chats.

2. ModelSelector (Model Switcher)

Allows selecting one or two models for arena comparison, displays model tags, descriptions, capabilities, custom avatars, and filters the list based on user roles.

3. ChatWindow (Message Tree)

Renders a hierarchical message tree where each message has id and parentId. Features include regeneration (adds a sibling node), branching, and metadata (model, latency, token count).

4. MessageInput (Rich Text Input)

Implemented with Tiptap, it supports markdown shortcuts, @‑mentions, #‑document references, $‑tool calls, file drag‑drop, multi‑line input, and voice input.

5. ResponseMessage (Message Bubble)

Handles real‑time streaming output, markdown‑to‑HTML conversion, syntax highlighting, KaTeX rendering, action buttons (copy, regenerate, edit, like/dislike), and source citation for RAG results.

Streaming Rendering Pipeline

The pipeline follows Receive → Concatenate → Parse → Render → Update DOM. Tokens arrive via fetch (ReadableStream) or WebSocket, are appended to the current content string, parsed with marked, highlighted with highlight.js, rendered with KaTeX, and finally injected into the DOM.
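As a sketch, the receive step over a fetch ReadableStream can look like this (the function name and callback are illustrative, not Open WebUI's actual code):

```typescript
// Read a streamed response body chunk by chunk and hand each decoded
// token fragment to a callback. The callback is where the concatenate,
// parse, and render steps happen.
async function readTokenStream(
  stream: ReadableStream<Uint8Array>,
  onToken: (token: string) => void
): Promise<void> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    onToken(decoder.decode(value, { stream: true }));
  }
}
```

In practice the chunks are often SSE lines (`data: …`) that need an extra parsing step before the token text is extracted.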

Performance pitfall: each token triggers a full markdown parse, highlight, and KaTeX render on the entire content, causing UI jank for long responses.

Optimization Ideas

Throttled rendering: batch incoming tokens and flush them to the UI at most once per animation frame via requestAnimationFrame (or on a fixed timer, e.g. every 50 ms), instead of re‑rendering on every token.

let buffer = ''
let rafId = null

// Accumulate tokens; flush at most once per animation frame
function onToken(token) {
  buffer += token
  if (!rafId) {
    rafId = requestAnimationFrame(() => {
      updateContent(buffer) // appends the batched tokens to the message
      buffer = ''
      rafId = null
    })
  }
}

Incremental parsing: render completed blocks once and cache the result; only the currently streaming block is re‑parsed.

Completed paragraph 1 ← render once, cache
Completed code block ← render once, cache
Streaming paragraph ← re‑render only this part
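A minimal sketch of this caching scheme, assuming blocks are separated by blank lines, with renderBlock standing in for the real marked + highlight.js pipeline:

```typescript
// Block-level caching: every block except the last can no longer change,
// so it is rendered once and cached; only the still-streaming tail block
// is re-rendered on each update.
function makeIncrementalRenderer(renderBlock: (src: string) => string) {
  const cache: string[] = []; // rendered HTML of completed blocks
  let completed = 0;          // number of blocks known to be final
  return (content: string): string => {
    const blocks = content.split(/\n{2,}/);
    while (completed < blocks.length - 1) {
      cache[completed] = renderBlock(blocks[completed]);
      completed++;
    }
    const tail = renderBlock(blocks[blocks.length - 1] ?? '');
    return cache.slice(0, completed).join('') + tail;
  };
}
```

The naive approach re-parses every block on every token; here each completed block is parsed exactly once.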

Deferred highlighting & formula rendering: postpone highlight.js and KaTeX until the stream ends.

Virtual scrolling: render only visible messages when a conversation contains dozens or hundreds of entries.
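For fixed-height rows the core of virtual scrolling is just index math; real chat lists measure per-message heights, so this is a simplified sketch:

```typescript
// Compute which item indices are visible in the viewport, plus a small
// overscan margin so fast scrolling does not show blank rows.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemHeight: number,
  total: number,
  overscan = 2
): { start: number; end: number } {
  const start = Math.max(0, Math.floor(scrollTop / itemHeight) - overscan);
  const end = Math.min(total, Math.ceil((scrollTop + viewportHeight) / itemHeight) + overscan);
  return { start, end }; // render only items[start..end)
}
```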

State Management with Svelte Store

Open WebUI uses Svelte’s built‑in writable stores instead of external libraries. Major stores include:

config: application configuration (backend URL, feature flags)

models: list of available LLMs

user: current user information

chats: session list

settings: user preferences

WEBUI_NAME: application name

Global stores hold data shared across pages, while component‑local reactive variables manage the current chat’s messages, input content, and UI toggles.
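The store contract Svelte relies on (subscribe/set/update) is small; a minimal re-implementation shows why no external state library is needed:

```typescript
// Minimal writable store following Svelte's store contract: subscribers
// receive the current value immediately and on every subsequent set.
type Subscriber<T> = (value: T) => void;

function writable<T>(initial: T) {
  let value = initial;
  const subs = new Set<Subscriber<T>>();
  const set = (next: T) => {
    value = next;
    subs.forEach((fn) => fn(value));
  };
  return {
    set,
    update: (fn: (v: T) => T) => set(fn(value)),
    subscribe(run: Subscriber<T>) {
      subs.add(run);
      run(value); // stores push the current value on subscribe
      return () => { subs.delete(run); }; // unsubscribe function
    },
  };
}
```

Svelte's real stores add invalidation batching and the `$store` auto-subscription syntax on top of this contract.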

Message Tree Data Model

The core fields are parentId, childrenIds, and siblingIdx. Rendering starts from the root node and follows the currently selected branch, constructing the visible message sequence. Switching branches only requires updating siblingIdx to point to a different child.
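A sketch of that traversal, with field names taken from the description above (Open WebUI's actual store shape differs in detail):

```typescript
// Each message points up via parentId and down via childrenIds;
// siblingIdx selects which child branch is currently active.
interface Message {
  id: string;
  parentId: string | null;
  childrenIds: string[];
  siblingIdx: number;
  content: string;
}

// Walk from the root along the selected branch to build the
// linear message sequence the chat window actually renders.
function visibleChain(messages: Record<string, Message>, rootId: string): Message[] {
  const chain: Message[] = [];
  let current: Message | undefined = messages[rootId];
  while (current) {
    chain.push(current);
    const nextId = current.childrenIds[current.siblingIdx];
    current = nextId ? messages[nextId] : undefined;
  }
  return chain;
}
```

Regeneration appends a new id to the parent's childrenIds; switching branches just changes siblingIdx and re-runs the walk.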

RAG Integration (Knowledge Base)

Frontend responsibilities are limited to file upload (PDF, TXT, Markdown) and showing progress. When a user types # in the input, a document picker appears; the selected document IDs are sent with the message payload, and the backend performs vector similarity search and injects relevant snippets into the prompt. The UI also displays citation links for each referenced source.
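A hypothetical payload builder illustrating the idea; the field names here are invented for illustration and are not Open WebUI's exact API:

```typescript
// The frontend only attaches the selected document IDs; vector search
// and prompt injection happen entirely on the backend.
interface ChatRequest {
  model: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
  files?: { type: 'doc'; id: string }[]; // documents picked via '#'
}

function buildPayload(model: string, text: string, docIds: string[]): ChatRequest {
  return {
    model,
    messages: [{ role: 'user', content: text }],
    files: docIds.map((id) => ({ type: 'doc' as const, id })),
  };
}
```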

Plugin System (Tools, Functions, Pipelines)

Open WebUI exposes a three‑layer plugin mechanism:

Tools: external capabilities (weather, stock, web search) displayed in the UI when invoked.

Functions: custom backend‑side extensions written in Python, configurable via the admin UI.

Pipelines: hook points in the message flow (e.g., profanity filter, auto‑translation) driven by JSON‑Schema metadata, allowing the frontend to generate dynamic forms without code changes.
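The configuration-driven part can be sketched as a mapping from JSON-Schema-style metadata to form fields; the schema shape below is illustrative, not Open WebUI's exact format:

```typescript
// Given schema metadata for a plugin's settings, derive the form fields
// the admin UI renders, without shipping any plugin-specific frontend code.
interface SchemaProp { type: 'string' | 'number' | 'boolean'; title?: string; default?: unknown }
interface Schema { properties: Record<string, SchemaProp> }
interface Field { name: string; label: string; widget: 'text' | 'number' | 'checkbox'; value: unknown }

function schemaToFields(schema: Schema): Field[] {
  const widgets = { string: 'text', number: 'number', boolean: 'checkbox' } as const;
  return Object.entries(schema.properties).map(([name, p]) => ({
    name,
    label: p.title ?? name,
    widget: widgets[p.type],
    value: p.default ?? null,
  }));
}
```

A new pipeline only has to ship its schema; the frontend renders the matching form automatically.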

What to Learn and What Not to Copy

Learn:

Message‑tree structure (parentId + childrenIds) for branching.

Streaming rendering pipeline and its optimization patterns.

Rich‑text input with Tiptap instead of building a textarea from scratch.

Configuration‑driven plugin architecture (JSON‑Schema forms).

Multi‑model selector supporting arena mode.

Don’t copy blindly:

The choice of Svelte is not mandatory; React or Vue work equally well.

You don’t need to implement every feature at once—start with streaming, model switching, and session management.

Accept that the original implementation has performance issues; apply the incremental parsing and throttling techniques described above.

Conclusion

Building a usable AI chat UI is straightforward, but delivering a smooth, feature‑rich experience is challenging. The main hurdles are streaming rendering performance, a tree‑based message model, complex rich‑text input, and an extensible plugin system. Open WebUI already solves most of these problems, providing a solid reference for developers to adopt its architectural ideas and implement them with their preferred tech stack.

Tags: Frontend Architecture · Svelte · Open WebUI · streaming rendering · AI chat UI · message tree
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
