What is Embedding?

Techniques

An embedding is a way of representing data — such as words, sentences, or images — as a list of numbers (a vector) in a continuous space. Items that are semantically similar end up close together in this space, allowing machines to understand relationships between concepts.

Why It Matters

Embeddings are the foundation of modern search, recommendation, and language understanding systems — they let AI grasp meaning rather than just matching keywords.

In practice

In a word embedding, 'king' minus 'man' plus 'woman' produces a vector very close to 'queen', showing the model has captured gender relationships.

Related Terms

Vector Database

A vector database is a specialised database designed to store, index, and search high-dimensional vectors (embeddings) efficiently. It enables fast similarity searches — finding items whose vector representations are closest to a given query.

Tokenisation

Tokenisation is the process of breaking text into smaller units called tokens — which can be words, subwords, or characters — so that an AI model can process them numerically. Each token is mapped to a number that the model uses for computation.

Natural Language Processing (NLP)

Natural language processing is a branch of AI that enables machines to read, understand, interpret, and generate human language. It bridges the gap between human communication and computer understanding, covering tasks like translation, summarisation, and sentiment analysis.

Keep learning with guided projects

Our programme follows a structured level 3-6 curriculum with project-based learning, practical workflows, and guided implementation across business and career use cases. The full programme fee is £2,999 with flexible instalment plans.

Apply Now See Pricing Options