Next-Token Probability

Temperature Playground

Prompt: “The best way to learn a new language is to”

Temperature: 0.70

0 (Deterministic) to 1 (More Random)

How Temperature Works in LLMs

Temperature controls how deterministic or random the token selection is in language models.

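Low Temperature (near 0): The model becomes nearly deterministic, almost always picking the highest-probability token. At exactly 0 it always picks that token.
High Temperature (near 1): Probability is spread more evenly across the candidate tokens than at low temperature, introducing more randomness. At exactly 1, sampling follows the model's unmodified distribution.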
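
The effect described above can be sketched in a few lines of Python. This is a minimal illustration, not the playground's actual code: the function name `softmax_with_temperature` and the example logits are made up for the demonstration, and treating temperature 0 as "always pick the top token" is a common convention assumed here.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumed convention: temperature 0 means greedy decoding,
        # so all probability goes to the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Dividing by a small temperature widens the gaps between scores.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
example_logits = [2.0, 1.0, 0.1]
for t in (0.1, 0.7, 1.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running the loop shows the top token's share shrinking as the temperature rises from 0.1 to 1.0, which is the "flatter distribution" the slider produces.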

💡 Note: The model doesn't always pick the highest-probability token. Crank up the temperature for a flatter distribution. When the EOS (end-of-sequence) token is sampled, generation ends naturally (end_turn).
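
The note above can be sketched as a sampling loop. Everything here is a toy stand-in under stated assumptions: `toy_logits` replaces a real model, the four-word vocabulary and the `<eos>` marker are invented for illustration, and the `max_tokens` stop reason is an assumed fallback for runs that never sample EOS. Only the `end_turn` label comes from the note itself.

```python
import math
import random

EOS = "<eos>"  # hypothetical end-of-sequence marker
VOCAB = ["practice", "daily", "speak", EOS]  # toy vocabulary

def toy_logits(tokens):
    # Stand-in for a real model: EOS becomes more likely as the output grows.
    return [1.5, 1.0, 0.5, 0.3 * len(tokens)]

def sample_token(logits, temperature, rng):
    """Return the index of the next token."""
    if temperature <= 0:
        # Greedy: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # unnormalized probabilities
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(temperature, max_tokens=20, seed=0):
    rng = random.Random(seed)
    tokens = []
    stop_reason = "max_tokens"  # assumed label for hitting the length cap
    for _ in range(max_tokens):
        idx = sample_token(toy_logits(tokens), temperature, rng)
        if VOCAB[idx] == EOS:
            stop_reason = "end_turn"  # EOS sampled: generation ends naturally
            break
        tokens.append(VOCAB[idx])
    return tokens, stop_reason

print(generate(temperature=0.0))
print(generate(temperature=0.7))
```

With temperature 0 the loop repeats the top-scoring word until EOS overtakes it. With a higher temperature the lower-scoring words and EOS can appear at any step, so the length and content of the output vary with the seed.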

Remixed by Bora Lee · Based on the original Tiktokenizer
