Embedding
An embedding is a way of representing data (such as words, sentences, or images) as a list of numbers, called a vector, in a continuous space. Items that are semantically similar end up close together in this space, so the distance between two vectors can be used as a measure of how closely related the items are.
Embeddings are the foundation of modern search, recommendation, and language understanding systems — they let AI grasp meaning rather than just matching keywords.
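A well-known example comes from word embeddings: taking the vector for 'king', subtracting the vector for 'man', and adding the vector for 'woman' produces a vector very close to the one for 'queen'. This suggests the model has captured the relationship between gender and royalty as a consistent direction in the vector space.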
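The arithmetic in this example can be sketched in a few lines of Python. The vectors below are hand-made, four-dimensional toy values chosen purely for illustration; real embeddings are learned from data and typically have hundreds of dimensions. The names `toy_embeddings`, `cosine_similarity`, and `analogy` are hypothetical helpers, not part of any library.

```python
import math

# Hand-made toy vectors (illustrative values, NOT learned by a model).
# Dimensions, loosely: royalty, male, female, fruit.
toy_embeddings = {
    "king":  [0.9, 0.8, 0.1, 0.0],
    "queen": [0.9, 0.1, 0.8, 0.0],
    "man":   [0.1, 0.9, 0.1, 0.0],
    "woman": [0.1, 0.1, 0.9, 0.0],
    "apple": [0.0, 0.1, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, embeddings):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in
              zip(embeddings[a], embeddings[b], embeddings[c])]
    # Exclude the input words so the answer is a new word.
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine_similarity(target, candidates[w]))


result = analogy("king", "man", "woman", toy_embeddings)
print(result)  # prints "queen" for these toy vectors
```

Cosine similarity is used here because it compares the direction of two vectors rather than their length, which is a common choice when comparing embeddings.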
Vector Database
A vector database is a specialised database designed to store, index, and search high-dimensional vectors (embeddings) efficiently. It enables fast similarity searches — finding items whose vector representations are closest to a given query.
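As a rough sketch of what a similarity search does, the following Python compares a query vector against every stored vector and returns the closest matches. `TinyVectorStore` is a hypothetical class written for this illustration, and the three-dimensional vectors are made-up values. It scans every item on each query; production vector databases generally rely on approximate nearest-neighbour indexes to avoid that cost on large collections.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store using brute-force search (illustration only)."""

    def __init__(self):
        self._items = []  # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        self._items.append((item_id, vector))

    def search(self, query, k=2):
        """Return the ids of the k stored vectors most similar to the query."""
        scored = [(cosine_similarity(query, vector), item_id)
                  for item_id, vector in self._items]
        scored.sort(reverse=True)  # highest similarity first
        return [item_id for _, item_id in scored[:k]]


store = TinyVectorStore()
# Made-up embeddings for three documents.
store.add("doc_cats", [0.9, 0.1, 0.0])
store.add("doc_dogs", [0.8, 0.2, 0.1])
store.add("doc_tax", [0.0, 0.1, 0.9])

top_matches = store.search([1.0, 0.0, 0.0], k=2)
print(top_matches)  # prints ['doc_cats', 'doc_dogs']
```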
Tokenisation
Tokenisation is the process of breaking text into smaller units called tokens, which can be words, subwords, or characters, so that an AI model can process them numerically. Each token is mapped to an integer ID from a fixed vocabulary, and the model works with these IDs rather than with the raw text.
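A minimal word-level tokeniser shows the idea. This is a simplified sketch: it splits on words and punctuation and assigns IDs in order of first appearance, whereas many modern language models use subword schemes instead. The function names and the `<unk>` placeholder for unseen words are assumptions made for this example.

```python
import re

UNKNOWN_TOKEN = "<unk>"  # placeholder for words not in the vocabulary


def split_into_tokens(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocabulary(corpus):
    """Assign each distinct token an integer ID, in order of first appearance."""
    vocabulary = {UNKNOWN_TOKEN: 0}
    for token in split_into_tokens(corpus):
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(text, vocabulary):
    """Map text to a list of token IDs; unseen tokens map to the <unk> ID."""
    unknown_id = vocabulary[UNKNOWN_TOKEN]
    return [vocabulary.get(token, unknown_id) for token in split_into_tokens(text)]


vocabulary = build_vocabulary("The cat sat on the mat.")
token_ids = encode("The dog sat.", vocabulary)
print(token_ids)  # prints [1, 0, 3, 6]; "dog" is unseen, so it maps to 0
```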
Natural Language Processing (NLP)
Natural language processing is a branch of AI that enables machines to read, understand, interpret, and generate human language. It bridges the gap between human communication and computer understanding, covering tasks like translation, summarisation, and sentiment analysis.