Tokenizer Lab

One prompt, multiple public model tokenizers, live browser-side token boundaries.

Client-side Kimi-K2.5

Model

Tokenizer workbench

The original anchor for this page: a public Qwen 3.5 checkpoint with the newer Qwen instruct serialization.

View
Plain text shows raw segmentation. Chat mode shows the model-formatted prompt and its control tokens.
Model details Qwen / Qwen 3.5 / 4B
Family vocabulary Single public configuration
Template style Qwen 3.5 instruct
Resolved repo Qwen/Qwen3.5-4B

Vocabulary Scope

Qwen 3.5 currently contributes one browser-fetchable public checkpoint here, so there is no sibling tokenizer comparison inside the family yet.

Available Here

Qwen 3.5 exposes one public configuration in this browser build: 4B.

Line Context

Qwen 2.5, Qwen 3, and Qwen 3.5 are grouped here as successive generations of one Chinese model line.

Browser Runtime

This view loads Qwen/Qwen3.5-4B directly in the browser from public tokenizer assets, so the token stream is computed client-side without a server hop.

Start Here

Paste text and watch tokens form

Replace the sample or paste your own text. The inline surface and token stream update immediately.

Text to tokenize

Prompt

This is the primary action for a new user. Replace the sample below or paste your own text and the token view will respond live.

Ready.
Live token view

Token map

The same text appears here, but grouped and colored by live token boundaries.

Metadata, exact IDs, and catalog
Exact serialized text

Serialized text

This is the exact string the tokenizer consumed for the current mode.

Token IDs

Token IDs

Quick copy/paste for debugging or evaluation workflows.

Family vocab

Single public configuration

Vocabulary relationship: Qwen 3.5 currently contributes one browser-fetchable public checkpoint here, so there is no sibling tokenizer comparison inside the family yet.

Qwen 2.5, Qwen 3, and Qwen 3.5 are grouped here as successive generations of one model line. Vocabulary badges apply within the selected family, not across the whole line.

Resolved metadata
Repository Qwen/Qwen3.5-4B
Model line Qwen
Family / generation Qwen 3.5
Configuration 4B
Vocabulary relationship Single public configuration
Template style Qwen 3.5 instruct

The original anchor for this page: a public Qwen 3.5 checkpoint with the newer Qwen instruct serialization.

Catalog
    Runtime gaps
      Token table

      Decoded surface, raw token, and flags

      # ID Decoded surface Vocabulary token Flags