4.3 Embeddings & Vector Stores
Key Concepts: Embeddings · Cosine similarity · ChromaDB · FAISS
Official Docs: OpenAI Embeddings · ChromaDB Docs · FAISS GitHub
What is an Embedding?
An embedding is a fixed-size dense vector representing the semantic meaning of a piece of text. Texts with similar meaning produce vectors that are close in vector space.
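from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog",
)
vector = response.data[0].embedding  # list of floats
print(len(vector))  # 1536 dimensions by default for this model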
Embedding Models
Consult the provider's official documentation for current models, dimensions, and pricing — these change over time.
Similarity Search
The standard similarity metric for embeddings is cosine similarity:
$$\text{sim}(a, b) = \frac{a \cdot b}{\|a\| \, \|b\|}$$
The result lies in $[-1, 1]$; higher values mean the two texts are more similar.
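As a minimal sketch of the formula above, the following pure-Python functions compute cosine similarity and L2-normalise a vector. The function names and the toy inputs are illustrative assumptions, not part of any library; production code would more likely use NumPy or the vector store's built-in metric.

```python
import math


def cosine_similarity(a, b):
    """Return (a . b) / (||a|| * ||b||) for two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def l2_normalize(v):
    """Scale v to unit length; for unit vectors the dot product equals the cosine."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in v]


# Toy vectors (not real embeddings): parallel, orthogonal, and opposite pairs.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # close to 1
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))           # opposite
```

Because the formula divides by both norms, scaling either vector leaves the score unchanged, which is why only the direction of an embedding matters for this metric.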
ChromaDB (Local / Server, Open-Source)
pip install chromadb langchain-chroma langchain-openai
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# Build the store from documents (chunks: a list of LangChain Document objects)
vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    persist_directory="./chroma_db",
)

# Similarity search
results = vector_store.similarity_search(
    query="What was Q3 revenue?",
    k=4,
)
for doc in results:
    # .get() avoids a KeyError when a chunk has no "source" metadata
    print(doc.metadata.get("source"), doc.page_content[:120])
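Conceptually, a similarity search embeds the query and returns the k stored vectors closest to it. The sketch below shows that idea as a brute-force scan in plain Python. It is an illustration only: the names `top_k` and `toy_index` and the 3-dimensional vectors are made up, and real vector stores use index structures and may default to a different distance metric, so check the store's documentation.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_vector, indexed, k=4):
    """Return the k (text, score) pairs most similar to query_vector.

    indexed is a list of (text, vector) pairs.
    """
    scored = [(text, _cosine(query_vector, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:k]


# Made-up 3-dimensional vectors standing in for real embeddings.
toy_index = [
    ("Quarterly revenue report", [0.9, 0.1, 0.0]),
    ("Office party photos", [0.0, 0.2, 0.9]),
    ("Revenue forecast notes", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.0, 0.0]  # hypothetical embedding of a revenue question

for text, score in top_k(query, toy_index, k=2):
    print(f"{score:.3f}  {text}")
```

A full scan like this costs time proportional to the number of stored vectors, which is one reason dedicated vector stores exist.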
FAISS (In-Memory, by Meta)
pip install faiss-cpu langchain-community
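from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Use the same embedding model for building and for loading the index
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = FAISS.from_documents(chunks, embeddings)

# Save / load to disk
vector_store.save_local("faiss_index")
vector_store = FAISS.load_local(
    "faiss_index",
    embeddings,
    # The saved index includes a pickle file, so only load files you trust
    allow_dangerous_deserialization=True,
)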
Choosing a Vector Store
| Store | Best for | Notes |
|---|---|---|
| ChromaDB | Development, small-medium | Open-source, disk-backed |
| FAISS | Fast local search | In-memory, no server needed |
| Pinecone | Managed production | SaaS |
| Weaviate | Self-hosted production | Open-source |
| pgvector | PostgreSQL + vectors | Extension |
Key Takeaways
- Embeddings convert text → vectors; similar text → close vectors
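- All LangChain vector stores implement the common VectorStore interface, so retrieval code (similarity_search, .as_retriever()) usually stays the same when you swap stores; only the code that constructs the store changes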
- ChromaDB is the easiest starting point for local development
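To illustrate the shared-interface point, here is a small sketch of retrieval code that does not care which store it receives. The helper name `retrieve` is hypothetical; it relies only on `as_retriever(search_kwargs=...)` and the retriever's `invoke` method, which LangChain vector stores and retrievers expose.

```python
def retrieve(vector_store, question, k=4):
    """Fetch the k most relevant documents from any LangChain-style vector store.

    vector_store: any object exposing as_retriever(), e.g. a Chroma or FAISS store.
    """
    retriever = vector_store.as_retriever(search_kwargs={"k": k})
    return retriever.invoke(question)
```

With a `vector_store` built as in the ChromaDB or FAISS sections, a call such as `retrieve(vector_store, "What was Q3 revenue?")` should work for either store without modification.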
Common Mistakes
- Mixing embedding models: you must use the same embedding model for indexing and querying. Mixing text-embedding-3-small for indexing with text-embedding-ada-002 for querying produces nonsense results.
- Not normalising vectors: some vector stores require L2-normalised vectors for cosine similarity. Check your vector store's requirements.
- Storing the vector store in /tmp: /tmp is cleared on restart. Persist your ChromaDB to a permanent directory: Chroma(persist_directory="./chroma_db").
- Re-embedding unchanged documents: embedding is expensive. Only re-embed documents that have changed. Track document hashes.
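One way to track document hashes, sketched under assumptions: each document has a stable id, and the hashes from the last indexing run are kept in a dict (persisting that dict, for example as JSON, is left out). The names `content_hash` and `find_changed` are hypothetical helpers, not part of LangChain or any vector store.

```python
import hashlib


def content_hash(text):
    """Stable SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_changed(documents, previous_hashes):
    """Compare documents against hashes recorded at the last indexing run.

    documents: dict mapping document id -> current text
    previous_hashes: dict mapping document id -> hash from the last run
    Returns (ids that are new or modified, updated hash dict).
    """
    current = {doc_id: content_hash(text) for doc_id, text in documents.items()}
    changed = [
        doc_id
        for doc_id, digest in current.items()
        if previous_hashes.get(doc_id) != digest
    ]
    return changed, current


# First run: nothing recorded yet, so every document needs embedding.
docs = {"intro.md": "Embeddings map text to vectors.", "faq.md": "What is FAISS?"}
to_embed, hashes = find_changed(docs, {})
print(to_embed)

# Second run: only the edited document is reported.
docs["faq.md"] = "What is FAISS? An in-memory index."
to_embed, hashes = find_changed(docs, hashes)
print(to_embed)
```

Only the ids returned in `to_embed` need to be sent to the embedding model again. Documents deleted since the last run are not handled here and would need to be removed from the store separately.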
Quick Quiz
Q1. What does cosine similarity measure between two embedding vectors?
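A1. The cosine of the angle between the two vectors, used as a proxy for semantic similarity. A value of 1 means the vectors point in the same direction (very similar meaning), 0 means they are orthogonal (unrelated), and -1 means they point in opposite directions.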
Q2. Why can you switch from ChromaDB to Pinecone without changing your LangChain retrieval code?
A2. LangChain wraps all vector stores with a common VectorStore interface. The .as_retriever() method is identical across all implementations.
Q3. What is the dimension of OpenAI’s text-embedding-3-small model output?
A3. 1536 dimensions by default (can be reduced, for example to 512, with the dimensions parameter).
Student Exercise
Exercise 4.3 — Embedding similarity explorer
Embed 10 sentences covering 2 topics (5 about Python programming, 5 about cooking). Compute all pairwise cosine similarities. Visualise with a heatmap. Verify that within-topic similarity > cross-topic similarity.