Token Budget
A token budget is the planned allocation of input and output tokens for a model request, often used to manage cost, latency, and context limits.
Controlling token budgets prevents prompt bloat, reduces spend, and improves response time in production systems.
A support workflow caps each model call at 2,000 total tokens to keep API costs predictable.
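A cap like the one above can be sketched as a simple pre-flight check. This is a minimal illustration, not a production implementation: the 2,000-token cap comes from the example, while the output reservation and the "1 token ≈ 4 characters" estimate are assumptions for demonstration only.

```python
TOTAL_BUDGET = 2000       # per-call cap, as in the support workflow example
MAX_OUTPUT_TOKENS = 500   # hypothetical reservation for the model's response

def estimate_tokens(text: str) -> int:
    """Crude estimate (~4 characters per token); use a real tokeniser in practice."""
    return max(1, len(text) // 4)

def fits_budget(prompt: str) -> bool:
    """True if the prompt plus the reserved output stays within the cap."""
    return estimate_tokens(prompt) + MAX_OUTPUT_TOKENS <= TOTAL_BUDGET

print(fits_budget("Summarise this support ticket."))  # short prompt fits
```

A check like this lets a workflow reject or shorten oversized prompts before spending money on the API call.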
Tokenisation
Tokenisation is the process of breaking text into smaller units called tokens — which can be words, subwords, or characters — so that an AI model can process them numerically. Each token is mapped to a number that the model uses for computation.
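The token-to-number mapping can be illustrated with a toy word-level tokeniser. The vocabulary and the `<unk>` convention here are illustrative assumptions; real models use learned subword vocabularies.

```python
def tokenise(text: str, vocab: dict[str, int]) -> list[int]:
    """Split text into word tokens and map each one to its numeric id."""
    tokens = text.lower().split()
    # Words missing from the vocabulary map to a shared <unk> id,
    # a common convention in simple tokenisers.
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]

vocab = {"<unk>": 0, "the": 1, "model": 2, "reads": 3, "tokens": 4}
print(tokenise("The model reads tokens", vocab))  # [1, 2, 3, 4]
```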
Context Window
A context window is the maximum amount of text, measured in tokens, that an AI model can consider at once when generating a response. Anything outside that limit is not directly visible to the model in the current request.
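One common way to stay inside the window is to drop the oldest messages in a conversation until the estimated total fits. This sketch reuses a rough characters-to-tokens estimate, which is an assumption for illustration.

```python
def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Drop oldest messages until the estimated token total fits the window."""
    def est(text: str) -> int:
        return max(1, len(text) // 4)  # crude ~4-chars-per-token estimate

    kept = list(messages)
    while kept and sum(est(m) for m in kept) > max_tokens:
        kept.pop(0)  # oldest message falls outside the window first
    return kept

history = ["a" * 40, "b" * 40, "c" * 40]  # ~10 estimated tokens each
print(trim_to_window(history, 25))  # oldest message is dropped
```

Dropping the oldest turns is the simplest policy; real systems often summarise older turns instead of discarding them outright.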
Latency
Latency is the delay between a user's request and the system's response. In AI systems, latency includes model processing time plus network and infrastructure overhead.
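End-to-end latency for a single call can be measured with a wall-clock timer around the request. The `fn` callable here stands in for any model or network call; the helper name is hypothetical.

```python
import time

def timed_call(fn):
    """Run fn() once and return (result, latency_in_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Example: time a cheap stand-in for a model request.
result, latency = timed_call(lambda: sum(range(1000)))
print(f"result={result}, latency={latency:.6f}s")
```

Measured this way, latency includes everything inside `fn`, so wrapping a real API call captures model processing plus network overhead together.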