How Browsers Turn URLs into Web Pages: Inside Rendering Engines and Parsing
From typing a URL to seeing a page, browsers perform a complex series of steps—including network requests, HTML and CSS parsing, DOM and render tree construction, layout, painting, and script execution—while handling errors and optimizations across components such as the UI, engine, networking, JavaScript interpreter, and storage.
Introduction
Browsers are among the most widely used pieces of software. This article explains how a browser works, from the moment you enter google.com in the address bar to the moment the Google homepage is displayed.
Browsers Covered
At the time of writing, five mainstream browsers were in use: IE, Firefox, Safari, Chrome, and Opera. The discussion focuses on the open-source browsers: Firefox, Chrome, and the partially open-source Safari.
Main Functions of a Browser
A browser retrieves web resources identified by a URI, renders them (typically HTML, but also PDF, images, etc.), and presents them in a window. The HTML and CSS specifications define how browsers should interpret documents. The W3C has historically maintained these standards; HTML is now maintained by the WHATWG as a living standard.
Because vendors add proprietary extensions, strict compliance is rare, leading to compatibility challenges for web developers.
High‑Level Structure (Components)
User Interface – address bar, navigation buttons, bookmarks, refresh/stop, home button.
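Browser Engine – passes actions between the user interface and the rendering engine.
Rendering Engine – parses HTML/CSS and paints the result.
Network – issues network calls such as HTTP requests through a platform-independent interface, with a separate implementation per platform.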
UI Backend – draws native widgets (menus, dialogs).
JavaScript Interpreter – executes JS code.
Data Storage – persistent storage (cookies, WebSQL, IndexedDB).
Component Communication
Firefox and Chrome each implement their own infrastructure for communication between components; it is not covered in this article.
Rendering Engine
The rendering engine’s job is to display the requested content. By default it can render HTML, XML, and images, and can use plugins for other formats (e.g., PDF).
Engines Used by the Discussed Browsers
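Firefox uses Gecko, an engine developed by Mozilla. Chrome and Safari both used WebKit when this was written (Safari's version is partially open-source); Chrome has since moved to Blink, a fork of WebKit.
WebKit began as an engine for the Linux platform (it descends from KDE's KHTML) and was later modified by Apple to support macOS and Windows.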
Main Flow of a Rendering Engine
1. Network fetches the document (often in 8 KB chunks).
2. Parse HTML → build DOM tree.
3. Parse CSS and combine with DOM to build the render tree.
4. Layout the render tree (compute coordinates).
5. Paint the render tree to the screen.
Parsing
Parsing converts a document into a structured tree (parse tree or syntax tree). For example, parsing the expression 2+3-1 yields a binary‑tree representation.
Grammars and Parsers
Parsing relies on a grammar (usually a context‑free grammar expressed in BNF). Two main parser types exist:
Top‑down parsers – start from the highest‑level rule and try to match input.
Bottom‑up parsers – build matches from the input upward (shift‑reduce).
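As an illustration of the top-down approach, here is a minimal recursive-descent sketch for the 2+3-1 example. The grammar (expression := term (('+' | '-') term)*, term := digit+) and all names (Node, ExprParser, evaluate, toString) are assumptions made for this example, not code from any browser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// A node is either a number leaf (op == 0) or a binary operation.
struct Node {
    char op = 0;
    int value = 0;
    std::unique_ptr<Node> left, right;
};

class ExprParser {
public:
    explicit ExprParser(std::string text) : text_(std::move(text)) {}

    // expression := term (('+' | '-') term)*
    std::unique_ptr<Node> parseExpression() {
        auto tree = parseTerm();
        while (pos_ < text_.size() && (text_[pos_] == '+' || text_[pos_] == '-')) {
            auto parent = std::make_unique<Node>();
            parent->op = text_[pos_++];
            parent->left = std::move(tree);   // left-associative: the tree so far becomes the left child
            parent->right = parseTerm();
            tree = std::move(parent);
        }
        if (pos_ != text_.size()) throw std::runtime_error("unexpected character");
        return tree;
    }

private:
    // term := digit+
    std::unique_ptr<Node> parseTerm() {
        if (pos_ >= text_.size() || !std::isdigit(static_cast<unsigned char>(text_[pos_])))
            throw std::runtime_error("expected a number");
        auto leaf = std::make_unique<Node>();
        while (pos_ < text_.size() && std::isdigit(static_cast<unsigned char>(text_[pos_])))
            leaf->value = leaf->value * 10 + (text_[pos_++] - '0');
        return leaf;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Walk the tree to compute the value of the expression.
int evaluate(const Node& n) {
    if (n.op == 0) return n.value;
    assert(n.left && n.right);  // operator nodes always have two children
    int l = evaluate(*n.left), r = evaluate(*n.right);
    return n.op == '+' ? l + r : l - r;
}

// Print the tree with explicit parentheses so its shape is visible.
std::string toString(const Node& n) {
    if (n.op == 0) return std::to_string(n.value);
    return "(" + toString(*n.left) + n.op + toString(*n.right) + ")";
}
```

Parsing "2+3-1" with this sketch yields a tree that toString renders as ((2+3)-1), because each new operator takes the tree built so far as its left child.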
HTML Parsing
HTML cannot be parsed with generic top-down or bottom-up parsers: its forgiving syntax is not easily described by a context-free grammar. Browsers instead implement a custom tokenization algorithm that turns the input stream into tokens (start tag, end tag, attribute name/value, character data, etc.), followed by a tree-construction algorithm that builds the DOM.
The tokenizer is a state machine (e.g., Data State, Tag Open State, Tag Name State). In the Data State, a '<' switches it to the Tag Open State. A letter then creates a start tag token and moves to the Tag Name State, which consumes characters until '>'. At that point the token is emitted and the tokenizer returns to the Data State.
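The state transitions can be sketched roughly as follows. This is a heavily simplified illustration with only three states; it assumes no attributes, comments, or entities, and the names (Token, TokenType, tokenize) are invented for the example rather than taken from any engine.

```cpp
#include <string>
#include <vector>

enum class TokenType { StartTag, EndTag, Characters };

struct Token {
    TokenType type;
    std::string data;  // tag name, or the text for Characters tokens
};

// Simplified tokenizer: three states, no attributes, comments, or entities.
std::vector<Token> tokenize(const std::string& input) {
    enum class State { Data, TagOpen, TagName };
    State state = State::Data;
    std::vector<Token> tokens;
    std::string text;                    // pending character data
    Token tag{TokenType::StartTag, ""};  // tag token under construction

    // Emit any buffered text as a single Characters token.
    auto flushText = [&] {
        if (!text.empty()) {
            tokens.push_back({TokenType::Characters, text});
            text.clear();
        }
    };

    for (char c : input) {
        switch (state) {
        case State::Data:
            if (c == '<') { flushText(); state = State::TagOpen; }
            else text += c;
            break;
        case State::TagOpen:
            tag = {TokenType::StartTag, ""};
            if (c == '/') tag.type = TokenType::EndTag;  // "</" starts an end tag
            else tag.data += c;                          // first character of the name
            state = State::TagName;
            break;
        case State::TagName:
            if (c == '>') { tokens.push_back(tag); state = State::Data; }
            else tag.data += c;
            break;
        }
    }
    flushText();
    return tokens;
}
```

For the input <html><body>Hello world</body></html>, the sketch produces five tokens: two start tags, one Characters token, and two end tags.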
Tree construction uses a stack of open elements to handle nesting, automatically inserting missing tags (e.g., <head>) and correcting mismatched structures.
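A stack of open elements can be sketched as below. This is an assumption-laden simplification: it closes unclosed children when an ancestor's end tag arrives and ignores end tags that match nothing, but it does not insert missing elements such as <head>. The Token, Element, and buildTree names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Token {
    enum Type { StartTag, EndTag, Characters } type;
    std::string data;  // tag name, or the text for Characters tokens
};

struct Element {
    std::string name;  // "#text" for text nodes, "#document" for the root
    std::string text;
    std::vector<std::unique_ptr<Element>> children;
};

std::unique_ptr<Element> buildTree(const std::vector<Token>& tokens) {
    auto root = std::make_unique<Element>();
    root->name = "#document";
    std::vector<Element*> open{root.get()};  // stack of open elements

    for (const Token& t : tokens) {
        assert(!open.empty());  // the document node is never popped
        if (t.type == Token::EndTag) {
            // Search the stack from the top for the matching open element.
            std::size_t i = open.size();
            while (i > 1 && open[i - 1]->name != t.data) --i;
            if (i > 1) open.resize(i - 1);  // pop the match and any unclosed children
            continue;                       // no match: the stray end tag is ignored
        }
        auto node = std::make_unique<Element>();
        if (t.type == Token::StartTag) {
            node->name = t.data;
        } else {
            node->name = "#text";
            node->text = t.data;
        }
        Element* raw = node.get();
        open.back()->children.push_back(std::move(node));  // attach to the current element
        if (t.type == Token::StartTag) open.push_back(raw);
    }
    return root;
}
```

Given the tokens for <div><p>text</div></span><b>, the </div> closes both the div and the unclosed p, the </span> is dropped, and the b becomes a sibling of the div.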
Error Tolerance
Browsers silently fix malformed HTML (e.g., </br> written in place of <br>, misplaced <table> elements, nested forms). The fixing code is internal and invisible to the user. WebKit, for example, treats a closing </br> as an opening <br>:
if (t->isCloseTag(brTag) && m_document->inCompatMode()) {
    reportError(MalformedBRError);
    t->beginTag = true;
}
CSS Parsing
Unlike HTML, CSS has a context-free grammar and can be parsed with standard parsers. Tokens are defined by regular expressions (identifiers, numbers, comments, etc.). The grammar describes rulesets, each made up of one or more selectors and a block of declarations.
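To make the ruleset structure concrete, here is a small hand-written sketch that splits a stylesheet into rulesets, each with a selector string and a list of declarations. It assumes well-formed input and deliberately ignores comments, at-rules, strings, and nested blocks; Declaration, Ruleset, trim, and parseCss are names invented for the example, and real engines use a full tokenizer and grammar instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Declaration { std::string property, value; };

struct Ruleset {
    std::string selector;  // kept as raw text, e.g. "div.error, a.error"
    std::vector<Declaration> declarations;
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// ruleset := selector '{' (property ':' value ';')* '}'
std::vector<Ruleset> parseCss(const std::string& css) {
    std::vector<Ruleset> rules;
    std::size_t pos = 0;
    while (true) {
        std::size_t open = css.find('{', pos);
        if (open == std::string::npos) break;
        std::size_t close = css.find('}', open);
        if (close == std::string::npos) break;  // unterminated block: stop
        Ruleset rule;
        rule.selector = trim(css.substr(pos, open - pos));
        std::size_t cur = open + 1;
        while (cur < close) {
            // A declaration ends at ';' or at the closing brace, whichever comes first.
            std::size_t semi = css.find(';', cur);
            if (semi == std::string::npos || semi > close) semi = close;
            std::string decl = css.substr(cur, semi - cur);
            std::size_t colon = decl.find(':');
            if (colon != std::string::npos)
                rule.declarations.push_back(
                    {trim(decl.substr(0, colon)), trim(decl.substr(colon + 1))});
            cur = semi + 1;
        }
        rules.push_back(std::move(rule));
        pos = close + 1;
    }
    return rules;
}
```

Calling parseCss on "div.error, a.error { color: red; font-weight: bold }" gives one ruleset with two declarations; the final declaration is accepted without a trailing semicolon.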
Script Parsing and Execution
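A script blocks document parsing while it is fetched and executed, unless it is marked defer or async. While the main parser is blocked on a script, browsers may run a speculative parse of the rest of the document on another thread. The speculative parse does not modify the DOM; it only discovers external resources (scripts, stylesheets, images) so that they can be fetched early.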
Render Tree Construction
While the DOM tree is being built, the browser also constructs a render tree made up of the visible elements, in the order in which they will be displayed. Firefox calls these elements frames; WebKit calls them render objects. The render tree is used for layout and painting.
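The relationship between the two trees can be sketched as follows. This is a minimal illustration under one assumption: a node whose display value is "none" contributes nothing to the render tree, and neither do its descendants. DomNode, RenderNode, and buildRenderTree are hypothetical names, and the display field stands in for a real computed style.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct DomNode {
    std::string tag;
    std::string display = "block";  // stand-in for the computed 'display' value
    std::vector<std::unique_ptr<DomNode>> children;
};

struct RenderNode {
    const DomNode* node;  // the DOM node this render node draws
    std::vector<std::unique_ptr<RenderNode>> children;
};

// Returns nullptr for a DOM node that is not displayed.
std::unique_ptr<RenderNode> buildRenderTree(const DomNode& dom) {
    if (dom.display == "none") return nullptr;  // skips the whole subtree
    auto render = std::make_unique<RenderNode>();
    render->node = &dom;
    for (const auto& child : dom.children) {
        if (auto sub = buildRenderTree(*child))
            render->children.push_back(std::move(sub));
    }
    return render;
}
```

The render tree therefore mirrors the DOM but is not a one-to-one copy: hidden nodes drop out, while every render node keeps a pointer back to its DOM node.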
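// WebKit's RenderObject, the base class of its render objects (abridged)
class RenderObject {
    virtual void layout();
    virtual void paint(PaintInfo);
    RenderStyle* style;  // the computed style
    Node* node;          // the DOM node
};
Style Computation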
Each render object needs computed style values. Styles come from user‑agent defaults, author stylesheets, inline styles, and presentational attributes. To avoid recomputing, browsers share style objects when possible (same tag, class, state, no IDs, etc.).
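Firefox builds a rule tree and a style-context tree so that matched rules can be shared and reused. WebKit has no rule tree; it traverses the matched declarations several times in cascade order (non-important first, then important), so that declarations applied later override earlier ones.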
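The effect of applying declarations in cascade order can be sketched with a sort followed by a single pass. This is not WebKit's code: it assumes all declarations come from one origin (author styles), collapses specificity to a single integer, and uses made-up names (MatchedDeclaration, cascade).

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

struct MatchedDeclaration {
    std::string property, value;
    bool important;   // carries !important
    int specificity;  // collapsed to one number for brevity
    int order;        // position in the source
};

// Sort so that weaker declarations come first, then apply in order:
// whatever is written last for a property wins.
std::map<std::string, std::string> cascade(std::vector<MatchedDeclaration> decls) {
    std::sort(decls.begin(), decls.end(),
              [](const MatchedDeclaration& a, const MatchedDeclaration& b) {
                  return std::make_tuple(a.important, a.specificity, a.order) <
                         std::make_tuple(b.important, b.specificity, b.order);
              });
    std::map<std::string, std::string> computed;
    for (const auto& d : decls) computed[d.property] = d.value;  // later entries override
    return computed;
}
```

With this ordering, an !important declaration beats a normal one regardless of specificity, higher specificity beats lower, and source order breaks the remaining ties.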
Further Reading
Source: http://www.kuqin.com/system-analysis/20120205/317831.html
