How LLMs Work
At their core, LLMs are statistical prediction engines.
They don’t think, reason, or “know” things the way humans do. Instead, they break input text into tokens: numbers that represent whole words or parts of words. This numeric representation matters because LLMs work entirely with probabilities, and numbers make it possible to calculate which outcomes are most likely. Tokenization also explains some curious behaviours in outputs, such as why a model can struggle to count the letters in a word it sees as only one or two tokens.
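To make this concrete, here is a minimal sketch of tokenization in Python, assuming the open-source tiktoken library (which implements the tokenizer used by several OpenAI models) is installed:

```python
# A minimal tokenization sketch, assuming the tiktoken library.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "LLMs are statistical prediction engines."
token_ids = enc.encode(text)   # text -> list of integer token IDs
print(token_ids)

# Each ID maps back to a fragment of text. Common words are often a
# single token, while rarer words split into several pieces.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))
```

Running this shows that the model never sees letters or words directly, only these integer IDs.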
The model analyses the surrounding context, looking at the tokens that came before. Then, one token at a time, it predicts what’s most likely to come next. This isn’t blind guessing: it assigns a probability to every token in its vocabulary, ranking them by how well they fit the patterns learned during training. Because the final choice is usually sampled from that probability distribution rather than always being the single top candidate, outputs are non-deterministic and can vary even with the same input.
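The ranking-and-sampling step can be illustrated with a toy example. The vocabulary and raw scores below are invented for illustration; a real model produces scores over tens of thousands of tokens using weights learned during training:

```python
# Toy sketch: turning raw scores (logits) over a vocabulary into a
# ranked probability distribution, then sampling the next token.
import math
import random

vocab = ["mat", "dog", "moon", "car"]   # hypothetical candidate tokens
logits = [3.2, 1.1, 0.4, -0.7]          # hypothetical raw scores

def softmax(scores, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
for token, p in sorted(zip(vocab, probs), key=lambda x: -x[1]):
    print(f"{token!r}: {p:.3f}")         # candidates, ranked by likelihood

# Sampling from the distribution, rather than always taking the top
# token, is one reason the same prompt can yield different outputs.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print("chosen:", next_token)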
It works like supercharged predictive text: instead of suggesting a single word, it keeps predicting token after token, writing whole paragraphs based on everything it has seen before.
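The token-by-token loop itself is simple. The sketch below uses a hypothetical predict_next() stand-in so the example stays self-contained; in a real LLM, that call would run the full neural network over every token generated so far:

```python
# Sketch of the autoregressive generation loop. predict_next() is a
# hypothetical stand-in for a trained model's prediction step.
def predict_next(context):
    """Return the next token given all tokens so far."""
    # A real model would score every token in its vocabulary given the
    # full context; this canned table just demonstrates the loop shape.
    table = {"The": "cat", "cat": "sat", "sat": "on",
             "on": "the", "the": "mat."}
    return table.get(context[-1], "...")

tokens = ["The"]
while not tokens[-1].endswith("."):
    tokens.append(predict_next(tokens))   # one token at a time,
print(" ".join(tokens))                   # conditioned on the past
```

Each new token is appended to the context and fed back in, which is why generation is called autoregressive.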
These predictions are based on statistical patterns in data, not true understanding. That’s why LLMs can sometimes generate convincing but incorrect information, a failure known as AI hallucination, and it is why human oversight is essential.