dictionary

Embeddings

Embeddings translate raw data into high-dimensional vectors. Words with similar meanings are mapped closer together in this mathematical space. This concept is the foundation of semantic search, recommendation engines, and modern RAG systems, allowing computers to understand that "puppy" and "dog" are related concepts, even though the letters are completely different.

CategoryFoundations

Reading time3 min read

Last updatedFeb 19, 2025

Definition

Mathematical representations of data (like text or images) as lists of numbers, enabling models to measure similarity and understand meaning.

Need this applied?

We help teams go from definitions to deployed workflows—safely and fast.

Get a tailored AI plan Try the AI Readiness Tool

Core mechanism

Cohere Google Developers

An embedding model takes an input (like a sentence) and outputs an array of floating-point numbers (e.g., [0.012, -0.84, 0.45...]). To find related documents, systemic evaluate the "distance" (often using cosine similarity) between vectors—the closer the vectors, the more similar the meaning.

FAQ

Do all models use the same embeddings?

No. Different embedding models (like OpenAI’s `text-embedding-3-small` vs. Cohere’s `embed-english-v3.0`) plot data in completely different spaces. You cannot compare an embedding from one model to an embedding from another.

Cohere