Visualize Transformer Attention with BertViz: Install and Example Walkthrough

This guide introduces BertViz, an interactive visualization tool for transformer models such as BERT, GPT‑2 and T5, explains how to install it via pip along with required dependencies, and demonstrates head, model, and neuron view visualizations with code examples in Jupyter.

Baobao Algorithm Notes

Jan 14, 2022

Overview

BertViz is an interactive tool for visualizing the attention mechanisms of Transformer‑based language models such as BERT, GPT‑2 and T5. It works with most Hugging Face models and can be run directly in Jupyter or Colab notebooks via a simple Python API.

Installation

Install the core package with pip:

pip install bertviz

To run it in a notebook, also install JupyterLab and ipywidgets:

pip install jupyterlab
pip install ipywidgets

Building the Model and Data

Load a BERT model and tokenizer, prepare two sentences, and obtain attention tensors.

from bertviz import head_view, model_view
from transformers import BertTokenizer, BertModel

# Load the pretrained model with attention outputs enabled
model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version)

# Encode the two sentences as a single sentence-pair input
sentence_a = "The cat sat on the mat"
sentence_b = "The cat lay on the rug"
inputs = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt')
input_ids = inputs['input_ids']
token_type_ids = inputs['token_type_ids']

# With output_attentions=True, the last element of the model output holds the attention weights
attention = model(input_ids, token_type_ids=token_type_ids)[-1]

# Index of the first token of sentence B (the first position with segment id 1)
sentence_b_start = token_type_ids[0].tolist().index(1)
tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
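To make the role of sentence_b_start concrete, here is a small sketch that does not load the model. The token list and segment ids are written by hand and assume that each word in the two sentences maps to a single WordPiece token; the names `example_tokens` and `example_type_ids` are illustrative only. For reference, the attention value returned by the model is a tuple with one tensor per layer, each of shape (batch, heads, sequence length, sequence length).

```python
# Hand-written stand-ins for the tokenizer output (assumed, not computed).
example_tokens = ["[CLS]", "the", "cat", "sat", "on", "the", "mat", "[SEP]",
                  "the", "cat", "lay", "on", "the", "rug", "[SEP]"]

# Segment ids: 0 covers [CLS], sentence A and the first [SEP];
# 1 covers sentence B and the final [SEP].
example_type_ids = [0] * 8 + [1] * 7

# Same computation as in the article: first position whose segment id is 1.
example_b_start = example_type_ids.index(1)

sentence_a_tokens = example_tokens[:example_b_start]
sentence_b_tokens = example_tokens[example_b_start:]

print(example_b_start)    # 8
print(sentence_b_tokens)  # tokens of sentence B, ending with [SEP]
```

BertViz uses this index to split the token list into the two sentences of the pair.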

Head‑View Visualization

The head view shows attention from one token to another for selected heads in a single Transformer layer. Lines encode attention weight (thickness) and head identity (color). Use the following call to launch the interactive view:

head_view(attention, tokens, sentence_b_start)

Interaction tips: hover over tokens to filter attention, click or double‑click colored blocks to select or isolate heads, and use the layer dropdown to change the displayed layer.
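The lines drawn in the head view correspond to the rows of one head's attention matrix. As a rough illustration, the sketch below takes a small hand-made matrix (toy values, not real model output) and reports, for each source token, the target token with the largest weight. The function name `strongest_targets` is hypothetical. With real output, a matrix of this form could be obtained from something like `attention[layer][0][head].tolist()`, given the (batch, heads, sequence, sequence) layout of each layer's tensor.

```python
def strongest_targets(head_weights, tokens):
    """For each source token, return (source, target, weight) for its largest weight."""
    result = []
    for i, row in enumerate(head_weights):
        # Index of the largest weight in this row
        j = max(range(len(row)), key=row.__getitem__)
        result.append((tokens[i], tokens[j], row[j]))
    return result


# Toy data: 4 tokens, one head; each row sums to 1 like a softmax output.
toy_tokens = ["[CLS]", "cat", "sat", "[SEP]"]
toy_head = [
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.15, 0.60, 0.20],
    [0.10, 0.55, 0.25, 0.10],
    [0.40, 0.20, 0.20, 0.20],
]

for source, target, weight in strongest_targets(toy_head, toy_tokens):
    print(f"{source} -> {target} ({weight:.2f})")
```

Each printed pair corresponds to the most prominent line leaving that source token in the visualization.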

Model‑View Visualization

The model view provides a matrix overview of attention weights across all layers and heads. Clicking a cell opens a detailed view for that specific head.

model_view(attention, tokens, sentence_b_start)
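The model view lays out one cell per (layer, head) pair, so bert-base-uncased, with 12 layers and 12 heads, gives a 12 by 12 grid. A common use of the grid is to scan for the head that links two particular tokens most strongly. The sketch below does that scan on toy nested lists; the layout `attention[layer][head][source][target]` (batch dimension dropped) and the function name `best_cell` are assumptions made for illustration.

```python
def best_cell(attention, source, target):
    """Return (layer, head, weight) of the cell with the largest source->target weight.

    attention[layer][head] is a seq x seq matrix of attention weights.
    """
    best = None
    for layer_idx, layer in enumerate(attention):
        for head_idx, head in enumerate(layer):
            weight = head[source][target]
            if best is None or weight > best[2]:
                best = (layer_idx, head_idx, weight)
    return best


# Toy data: 2 layers x 2 heads over a 2-token sequence (not real model output).
toy_attention = [
    [  # layer 0
        [[0.9, 0.1], [0.5, 0.5]],  # head 0
        [[0.3, 0.7], [0.5, 0.5]],  # head 1
    ],
    [  # layer 1
        [[0.6, 0.4], [0.5, 0.5]],  # head 0
        [[0.8, 0.2], [0.5, 0.5]],  # head 1
    ],
]

print(best_cell(toy_attention, source=0, target=1))  # (0, 1, 0.7)
```

The returned layer and head indices identify the cell to click in the model view for a closer look.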

Neuron‑View Visualization

The neuron view visualizes intermediate representations (query, key, etc.) used to compute attention. In the collapsed view, lines represent token‑to‑token attention; expanding a node reveals the underlying vectors.

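# The neuron view uses BertViz's own model classes, which expose the query and key vectors
# (these imports shadow the transformers classes of the same name)
from bertviz.transformers_neuron_view import BertModel, BertTokenizer
from bertviz.neuron_view import show

model_type = 'bert'
model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version, do_lower_case=True)

# sentence_a and sentence_b are the strings defined earlier; layer and head set the initial selection
show(model, model_type, tokenizer, sentence_a, sentence_b, layer=4, head=3)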

Interaction tips are similar to the other views: hover to filter, click plus icons to expand intermediate vectors, and use dropdowns to select layers or heads.
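The query and key vectors shown in the neuron view feed the standard scaled dot-product computation: each attention weight is the softmax of a query-key dot product divided by the square root of the vector dimension. The following pure-Python sketch reproduces that computation on toy two-dimensional vectors. The function names and values are illustrative assumptions; real BERT heads use 64-dimensional vectors produced by learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Row i holds softmax(q_i . k_j / sqrt(d)) over all key positions j."""
    scale = math.sqrt(len(keys[0]))
    weights = []
    for q_vec in queries:
        scores = [sum(a * b for a, b in zip(q_vec, k_vec)) / scale for k_vec in keys]
        weights.append(softmax(scores))
    return weights


# Toy query and key vectors for two tokens (not taken from a real model).
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
toy_keys = [[1.0, 0.0], [0.0, 1.0]]

for row in attention_weights(toy_queries, toy_keys):
    print([round(w, 3) for w in row])
```

Each row sums to 1, and a token attends most to the position whose key is best aligned with its query, which is the pattern the expanded neuron view makes visible.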
