
How Browsers Work: Architecture, Rendering Engine, Parsing, and Rendering Process

This article explains the internal architecture of modern browsers, describing their main components, high‑level structure, communication mechanisms, rendering engine workflow, HTML/CSS/JavaScript parsing, error‑tolerance strategies, and style computation, using examples from Firefox, Chrome, and Safari.

Qunar Tech Salon

Browsers are among the most widely used software applications. This article explains how they work, from the moment a user types a URL until the page is displayed. It first lists the five mainstream browsers (IE, Firefox, Safari, Chrome, Opera) and notes that the open‑source browsers (Firefox, Chrome, and the partly open‑source Safari) now account for a large share of the market.

The main functions of a browser are requesting web resources by URI, interpreting HTML and CSS, and rendering the result. The user interface typically consists of an address bar, back and forward buttons, bookmarking options, refresh and stop buttons, and a home button, although no formal specification defines these elements.

High‑level structure comprises the user interface, browser engine, rendering engine, network layer, UI backend, JavaScript interpreter, and data storage. Chrome assigns a separate rendering engine instance to each tab, making each tab an independent process.

Component communication is handled by special messaging structures in Firefox and Chrome, discussed in later chapters.

The rendering engine is responsible for rendering the requested content. Both WebKit (used by Safari, and by Chrome until it moved to the WebKit fork Blink) and Gecko (used by Firefox) follow a similar pipeline: fetch the content, parse the HTML to build a DOM tree, construct a render tree, lay out the render tree, and paint it. The process is incremental: parts of the page are displayed as soon as they are parsed, without waiting for the whole document.

Parsing transforms a source document into a syntax tree. The article covers lexical analysis, syntax analysis, grammar definitions (CSS and JavaScript can be described by context‑free grammars; HTML, because of its forgiving nature, cannot be handled by conventional context‑free parsers), and generator tools (Flex for lexers, Bison for parsers). It shows a simple arithmetic expression parser example and explains top‑down vs. bottom‑up parsing, including shift‑reduce parsers.
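To make the split between lexical analysis and syntax analysis concrete, here is a minimal sketch of a top‑down (recursive descent) parser for integer expressions with + and -. The toy grammar and the names tokenize, parseExpression, ExprNode, and evaluate are assumptions made for illustration; this is not code from any browser.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy grammar (an assumption for this sketch):
//   expression := term (('+' | '-') term)*
//   term       := INTEGER
enum class TokenKind { Integer, Plus, Minus, End };

struct Token {
    TokenKind kind;
    int value;  // only meaningful for Integer tokens
};

// Lexical analysis: turn characters into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            int value = 0;
            while (i < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[i]))) {
                value = value * 10 + (input[i] - '0');
                ++i;
            }
            tokens.push_back(Token{TokenKind::Integer, value});
        } else if (c == '+' || c == '-') {
            tokens.push_back(Token{c == '+' ? TokenKind::Plus : TokenKind::Minus, 0});
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    tokens.push_back(Token{TokenKind::End, 0});  // sentinel that ends the stream
    return tokens;
}

// Syntax tree node: leaves hold integers, inner nodes hold an operator.
struct ExprNode {
    TokenKind kind;
    int value;
    std::unique_ptr<ExprNode> left;
    std::unique_ptr<ExprNode> right;
};

// term := INTEGER
std::unique_ptr<ExprNode> parseTerm(const std::vector<Token>& tokens, std::size_t& pos) {
    if (tokens[pos].kind != TokenKind::Integer)
        throw std::runtime_error("expected an integer");
    std::unique_ptr<ExprNode> leaf(
        new ExprNode{TokenKind::Integer, tokens[pos].value, nullptr, nullptr});
    ++pos;
    return leaf;
}

// Syntax analysis (top-down): mirrors the expression rule and builds a
// left-associative tree.
std::unique_ptr<ExprNode> parseExpression(const std::string& input) {
    const std::vector<Token> tokens = tokenize(input);
    std::size_t pos = 0;
    std::unique_ptr<ExprNode> tree = parseTerm(tokens, pos);
    while (tokens[pos].kind == TokenKind::Plus || tokens[pos].kind == TokenKind::Minus) {
        TokenKind op = tokens[pos].kind;
        ++pos;
        std::unique_ptr<ExprNode> right = parseTerm(tokens, pos);
        std::unique_ptr<ExprNode> parent(
            new ExprNode{op, 0, std::move(tree), std::move(right)});
        tree = std::move(parent);
    }
    if (tokens[pos].kind != TokenKind::End)
        throw std::runtime_error("unexpected token after expression");
    return tree;
}

// Walk the syntax tree to compute the value of the expression.
int evaluate(const ExprNode& node) {
    switch (node.kind) {
    case TokenKind::Integer: return node.value;
    case TokenKind::Plus:    return evaluate(*node.left) + evaluate(*node.right);
    case TokenKind::Minus:   return evaluate(*node.left) - evaluate(*node.right);
    default: throw std::runtime_error("invalid node");
    }
}
```

Calling evaluate(*parseExpression("2 + 3 - 1")) walks the tree built for the input, while an input such as "2 + + 3" is rejected with an exception because it does not match the grammar. A bottom‑up (shift‑reduce) parser would accept the same language but would build the tree from the leaves upward using an explicit stack.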

HTML parsing uses a tokenization algorithm (a state machine) to produce tokens such as start tags, end tags, and character data, followed by a tree‑construction algorithm that builds the DOM tree, handling implicit elements (e.g., head) and error tolerance. The article includes several code snippets illustrating error handling and tag correction:

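// </br> written instead of <br>: in compatibility mode, treat it as <br>.
if (t->isCloseTag(brTag) && m_document->inCompatMode()) {
    reportError(MalformedBRError);
    t->beginTag = true;
}

// Stray table (a table inside another table but outside a cell):
// pop the open table so the two tables become siblings.
if (m_inStrayTableContent && localName == tableTag)
    popBlock(tableTag);

// Nested form elements: a form that appears inside another form is ignored.
if (!m_currentFormElement) {
    m_currentFormElement = new HTMLFormElement(formTag, m_document);
}

// Overly deep tag hierarchy: returns false once cMaxRedundantTagDepth
// tags of the same type are already nested.
bool HTMLParser::allowNestedRedundantTag(const AtomicString& tagName) {
    unsigned i = 0;
    for (HTMLStackElem* curr = m_blockStack;
         i < cMaxRedundantTagDepth && curr && curr->tagName == tagName;
         curr = curr->next, i++) { }
    return i != cMaxRedundantTagDepth;
}

// Misplaced html or body end tags: ignore them.
if (t->tagName == htmlTag || t->tagName == bodyTag)
    return;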
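
Returning to the tokenization stage, the following is a minimal sketch of a tokenizer written as a state machine. The state names loosely follow the HTML5 tokenization algorithm, but the function tokenizeHtml and the HtmlToken type are hypothetical, and attributes, comments, and DOCTYPEs are deliberately left out.

```cpp
#include <cctype>
#include <string>
#include <vector>

enum class HtmlTokenType { StartTag, EndTag, Character };

struct HtmlToken {
    HtmlTokenType type;
    std::string data;  // tag name, or a single character of text
};

// Reduced tokenizer: consumes one character at a time and switches state.
std::vector<HtmlToken> tokenizeHtml(const std::string& input) {
    enum class State { Data, TagOpen, EndTagOpen, TagName };
    State state = State::Data;
    std::vector<HtmlToken> tokens;
    HtmlToken current{HtmlTokenType::StartTag, ""};

    for (char ch : input) {
        const unsigned char c = static_cast<unsigned char>(ch);
        const char lower = static_cast<char>(std::tolower(c));
        switch (state) {
        case State::Data:
            if (c == '<') {
                state = State::TagOpen;
            } else {
                tokens.push_back(HtmlToken{HtmlTokenType::Character, std::string(1, ch)});
            }
            break;
        case State::TagOpen:
            if (c == '/') {
                state = State::EndTagOpen;
            } else if (std::isalpha(c)) {
                current = HtmlToken{HtmlTokenType::StartTag, std::string(1, lower)};
                state = State::TagName;
            } else {
                // Not a tag after all: emit the pending '<' as text.
                tokens.push_back(HtmlToken{HtmlTokenType::Character, "<"});
                if (c != '<') {
                    tokens.push_back(HtmlToken{HtmlTokenType::Character, std::string(1, ch)});
                    state = State::Data;
                }
                // If c is '<', stay in TagOpen: it may start a real tag.
            }
            break;
        case State::EndTagOpen:
            if (std::isalpha(c)) {
                current = HtmlToken{HtmlTokenType::EndTag, std::string(1, lower)};
                state = State::TagName;
            } else {
                state = State::Data;  // malformed end tag: dropped in this sketch
            }
            break;
        case State::TagName:
            if (c == '>') {
                tokens.push_back(current);  // emit the finished tag token
                state = State::Data;
            } else {
                current.data += lower;  // tag names are case-insensitive
            }
            break;
        }
    }
    return tokens;
}
```

For the input "<P>Hi</p>" this sketch emits a start tag p, the characters H and i, and an end tag p. A real tokenizer has many more states, and the tree‑construction stage can change the tokenizer state while it consumes the tokens.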

Render objects are created based on the CSS display property. Example code from WebKit shows object creation:

class RenderObject {
    virtual void layout();
    virtual void paint(PaintInfo);
    virtual rect repaintRect();
    Node* node;                    // the DOM node
    RenderStyle* style;            // the computed style
    RenderLayer* containingLayer;  // the containing z-index layer
};
RenderObject* RenderObject::createObject(Node* node, RenderStyle* style) {
    Document* doc = node->document();
    RenderArena* arena = doc->renderArena();
    RenderObject* o = 0;
    switch (style->display()) {
        case NONE: break;
        case INLINE: o = new (arena) RenderInline(node); break;
        case BLOCK: o = new (arena) RenderBlock(node); break;
        case INLINE_BLOCK: o = new (arena) RenderBlock(node); break;
        case LIST_ITEM: o = new (arena) RenderListItem(node); break;
        // ...
    }
    return o;
}

The render tree corresponds to the DOM tree, but the mapping is not one‑to‑one: non‑visual nodes (e.g., head) and elements with display: none are omitted, while some elements generate multiple render objects (e.g., select).
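
The relationship between the two trees can be sketched as a recursive walk that skips any node whose computed display value is none. DomNode, RenderNode, buildRenderTree, and countRenderNodes are hypothetical names for this illustration, and modeling head as display: none (as browser default stylesheets typically do) is an assumption of the sketch rather than WebKit code.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Display { None, Inline, Block };

// Hypothetical, heavily simplified DOM node with its computed display value.
struct DomNode {
    std::string tag;
    Display display;
    std::vector<DomNode> children;
};

// Hypothetical render object: points back at the DOM node it renders.
struct RenderNode {
    const DomNode* node = nullptr;
    std::vector<std::unique_ptr<RenderNode>> children;
};

// Build the render tree for a DOM subtree. A display:none node produces no
// render object, and its whole subtree is skipped with it.
std::unique_ptr<RenderNode> buildRenderTree(const DomNode& dom) {
    if (dom.display == Display::None)
        return nullptr;
    std::unique_ptr<RenderNode> render(new RenderNode);
    render->node = &dom;
    for (const DomNode& child : dom.children) {
        std::unique_ptr<RenderNode> childRender = buildRenderTree(child);
        if (childRender)
            render->children.push_back(std::move(childRender));
    }
    return render;
}

// Count the render objects in a (possibly empty) render tree.
std::size_t countRenderNodes(const RenderNode* render) {
    if (!render)
        return 0;
    std::size_t total = 1;
    for (const auto& child : render->children)
        total += countRenderNodes(child.get());
    return total;
}
```

For a document made of html, head (with a title), and body (with one visible p and one hidden span), the sketch produces render objects only for html, body, and p. It covers the omission case only; elements that generate several render objects, such as select, would need extra logic.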

Style computation gathers style information from user agent stylesheets, author stylesheets, inline styles, and HTML visual attributes. The article discusses rule trees (Firefox) and style context trees, caching of style data, specificity calculation, and cascade order (from lowest to highest priority: browser declarations, user normal declarations, author normal declarations, author important declarations, user important declarations). It also explains how browsers optimize rule matching using hash maps keyed by ID, class name, and tag name.
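
As a rough illustration of how cascade order and specificity combine, the sketch below picks the winning declaration for a single property. It assumes the CSS 2.1 rules: origin and importance are compared first, then the four‑part specificity, then source order. The types Origin, Specificity, and Declaration and the function winningDeclaration are hypothetical and do not come from Gecko or WebKit.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <tuple>
#include <vector>

// Cascade levels from lowest to highest priority (CSS 2.1 ordering).
enum class Origin {
    UserAgent = 0,
    UserNormal,
    AuthorNormal,
    AuthorImportant,
    UserImportant
};

// CSS 2.1 specificity, compared left to right: style attribute, ID count,
// class/attribute/pseudo-class count, element/pseudo-element count.
struct Specificity {
    int inlineStyle;
    int ids;
    int classes;
    int elements;
};

struct Declaration {
    std::string value;
    Origin origin;
    Specificity specificity;
    int sourceOrder;  // position in the stylesheet; later declarations win ties
};

// True when a loses to b in the cascade.
bool lowerPriority(const Declaration& a, const Declaration& b) {
    auto key = [](const Declaration& d) {
        return std::make_tuple(static_cast<int>(d.origin),
                               d.specificity.inlineStyle, d.specificity.ids,
                               d.specificity.classes, d.specificity.elements,
                               d.sourceOrder);
    };
    return key(a) < key(b);
}

// Return the declaration that wins among all candidates for one property.
Declaration winningDeclaration(const std::vector<Declaration>& candidates) {
    if (candidates.empty())
        throw std::invalid_argument("no candidate declarations");
    return *std::max_element(candidates.begin(), candidates.end(), lowerPriority);
}
```

With two author rules, one matching by a single ID and one by two classes and an element name, the ID rule wins because IDs are compared before classes. An author declaration marked important beats both regardless of its specificity, since origin and importance are compared first.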

Finally, the article touches on script parsing, the synchronous execution model, the defer and async attributes, speculative parsing for parallel resource loading, and the interaction between scripts and style sheets.
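
The effect of defer and async on execution order can be sketched with a small model of classic scripts. It assumes the behavior defined by the HTML specification: both attributes only affect external scripts, and async takes precedence over defer. The Script struct and the functions classify and deterministicOrder are hypothetical names used for illustration.

```cpp
#include <string>
#include <vector>

enum class ScriptMode { ParserBlocking, Deferred, Async };

// Hypothetical description of a classic <script> element.
struct Script {
    std::string name;
    bool hasSrc;    // external script (src attribute present)
    bool hasAsync;  // async attribute present
    bool hasDefer;  // defer attribute present
};

// async and defer only affect external scripts; async takes precedence.
ScriptMode classify(const Script& s) {
    if (!s.hasSrc) return ScriptMode::ParserBlocking;
    if (s.hasAsync) return ScriptMode::Async;
    if (s.hasDefer) return ScriptMode::Deferred;
    return ScriptMode::ParserBlocking;
}

// Order in which the non-async scripts run, given their document order:
// parser-blocking scripts run as the parser reaches them, and deferred
// scripts run in document order once parsing has finished. Async scripts
// are left out because their timing depends on when each download completes.
std::vector<std::string> deterministicOrder(const std::vector<Script>& scripts) {
    std::vector<std::string> order;
    std::vector<std::string> deferred;
    for (const Script& s : scripts) {
        switch (classify(s)) {
        case ScriptMode::ParserBlocking: order.push_back(s.name); break;
        case ScriptMode::Deferred:       deferred.push_back(s.name); break;
        case ScriptMode::Async:          break;
        }
    }
    order.insert(order.end(), deferred.begin(), deferred.end());
    return order;
}
```

Given an external script, a deferred external script, and an inline script in that document order, the sketch runs the plain external script first, the inline script second, and the deferred script last. The model ignores speculative parsing and the rule that a script may have to wait for pending style sheets, both of which the article treats separately.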
