
Tokenisation

The process of splitting text into smaller units (tokens) that a language model treats as atomic, typically using a subword algorithm such as Byte-Pair Encoding (BPE), often via a library like SentencePiece.
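
To make the idea concrete, the snippet below shows subword splitting using OpenAI's open-source tiktoken library. The encoding name is an illustrative choice, not tied to any particular system, and the exact splits depend on the vocabulary.

```python
# Illustrative subword splitting with tiktoken (pip install tiktoken).
# The encoding name is an assumption chosen for demonstration only.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("Tokenisation is unavoidable.")
pieces = [enc.decode([i]) for i in ids]

print(ids)     # the integer IDs the model actually sees
print(pieces)  # the subword chunks; exact splits depend on the vocabulary
```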

How it works

Language models do not see characters or words directly. They see tokens: numerical IDs for chunks of text that are typically one to four characters long in English. Tokenisation matters because every commercial inference cost, context-window limit, and rate limit is measured in tokens. A 1,000-word English document is roughly 1,300 tokens, but the same content as code, in a non-Latin script, or in an unusual format can produce significantly more.

For enterprise deployment this drives concrete decisions: how to chunk documents for retrieval-augmented generation (RAG), where each chunk must fit in the context window with room left over for the question and the answer; how to estimate inference cost; and where to invest in prompt compression. Ayoob AI accounts for tokenisation explicitly in production systems, and the engineering blog covers this layer in depth.
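
To make the numbers concrete, here is a minimal sketch of token counting, cost estimation, and token-budgeted chunking, again using tiktoken. The per-token price, chunk size, and overlap are placeholder assumptions for illustration, not Ayoob AI production settings.

```python
# Token counting, rough cost estimation, and token-budgeted chunking for
# RAG. The per-million-token price, chunk size, and overlap below are
# placeholder assumptions, not production settings.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Number of tokens the model sees (and the provider bills) for text."""
    return len(enc.encode(text))

def estimate_input_cost(text: str, usd_per_million_tokens: float = 3.0) -> float:
    """Rough input cost; the price per million tokens is a placeholder."""
    return count_tokens(text) / 1_000_000 * usd_per_million_tokens

def chunk_by_tokens(text: str, max_tokens: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks that each fit a token budget,
    leaving headroom in the context window for the question and answer."""
    ids = enc.encode(text)
    step = max_tokens - overlap
    return [enc.decode(ids[i:i + max_tokens]) for i in range(0, len(ids), step)]

doc = "word " * 1000  # stand-in for a 1,000-word document
print(count_tokens(doc))                   # ~1,000 for this repetitive text; real prose runs higher
print(f"${estimate_input_cost(doc):.6f}")  # cost at the placeholder rate
print(len(chunk_by_tokens(doc)))           # number of RAG chunks produced
```

The overlap between consecutive chunks is a common design choice: it reduces the chance that a passage relevant to a query is cut in half at a chunk boundary.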

Want to see this technology in action?

Book a Discovery Call