Input text: "The transformer model processes"
Tokenization
Text is broken into smaller units called tokens that the model can process.
What is a Token?
Tokens can be full words, subwords, or even punctuation. For example:
"empowers" → ["em", "powers"]
Vocabulary
GPT-2 uses a fixed vocabulary of 50,257 tokens.
This vocabulary is learned once, by running byte-pair encoding over a training corpus, and then reused unchanged for every input.
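In practice, a fixed vocabulary acts as a lookup table from token strings to integer IDs. The fragment below is a sketch with made-up entries and IDs; GPT-2's real table has 50,257 entries with different assignments.

```python
# Hypothetical fragment of a token-to-ID table; real GPT-2 IDs differ.
# Note the leading spaces: GPT-2 folds word boundaries into its tokens.
vocab = {"The": 0, " transformer": 1, " model": 2, " processes": 3}
VOCAB_SIZE = 50_257  # size of GPT-2's actual vocabulary

# The same table is reused for every input sequence.
ids = [vocab[t] for t in ["The", " transformer", " model", " processes"]]
print(ids)  # → [0, 1, 2, 3]
```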
Why Tokenization Matters
Tokens are the bridge between raw text and math: they allow language to be converted into vectors that neural networks can process.
Next Step
Each token will be mapped to an ID, which is then used to retrieve its embedding vector.
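That ID-to-vector step can be sketched as plain row indexing into an embedding table. The dimensions below are toy values (GPT-2 small actually uses 768-dimensional embeddings), and the random initialization stands in for weights that are learned during training.

```python
import random

VOCAB_SIZE = 50_257  # GPT-2's vocabulary size
EMBED_DIM = 8        # toy dimension; GPT-2 small uses 768

random.seed(0)
# One vector per vocabulary entry; real values are learned, not random.
embedding_table = [
    [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]

token_id = 1234  # hypothetical token ID
vector = embedding_table[token_id]  # embedding lookup is just row indexing
print(len(vector))  # → 8
```

Because lookup is indexing, every occurrence of the same token ID retrieves the identical vector before any position information is added.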