4.3 Embeddings & Vector Stores

AI-Generated Content

AI-generated content may contain errors. Always verify against official sources.

Key Concepts: Embeddings · Cosine similarity · ChromaDB · FAISS

Official Docs: OpenAI Embeddings · ChromaDB Docs · FAISS GitHub


What is an Embedding?

An embedding is a fixed-size dense vector representing the semantic meaning of a piece of text. Texts with similar meaning produce vectors that are close in vector space.

from openai import OpenAI

client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog",
)
vector = response.data[0].embedding
print(len(vector))  # 1536 dimensions

Embedding Models

Consult the provider's official documentation for current models, dimensions, and pricing — these change over time.


Cosine Similarity

The standard similarity metric for embeddings is cosine similarity:

$$\text{sim}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$$

Returns a value in $[-1, 1]$; higher = more similar.
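
As a minimal sketch of the formula above, the function below computes cosine similarity with the Python standard library only. The function name `cosine_similarity` and the toy 2-dimensional vectors are illustrative assumptions; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical, for illustration only)
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])       # close to -1
print(same_direction, orthogonal, opposite)
```

Only the direction of the vectors matters here: scaling `[1.0, 2.0]` to `[2.0, 4.0]` leaves the similarity unchanged up to floating-point rounding.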


ChromaDB (Local / Server, Open-Source)

pip install chromadb langchain-chroma langchain-openai

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# Build the store from documents.
# `chunks` is assumed to be a list of LangChain Document objects
# prepared beforehand (for example by a text splitter).
vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    persist_directory="./chroma_db",
)

# Similarity search: return the 4 chunks closest to the query
results = vector_store.similarity_search(
    query="What was Q3 revenue?",
    k=4,
)
for doc in results:
    print(doc.metadata["source"], doc.page_content[:120])
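
Conceptually, a similarity search ranks the stored vectors by their similarity to the query vector and returns the top k. The sketch below shows that idea with a brute-force scan over hand-made 3-dimensional vectors. `ToyVectorIndex` and `l2_normalise` are hypothetical names introduced for illustration; this is not how Chroma or FAISS are implemented, and production stores typically rely on specialised index structures instead of a full scan.

```python
import heapq
import math


def l2_normalise(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


class ToyVectorIndex:
    """Hypothetical in-memory index using a brute-force scan."""

    def __init__(self):
        self._items = []  # list of (unit_vector, text) pairs

    def add(self, vector, text):
        # Normalise once at index time so a query only needs dot products.
        self._items.append((l2_normalise(vector), text))

    def search(self, query_vector, k=4):
        q = l2_normalise(query_vector)
        # For unit vectors the dot product equals the cosine similarity.
        scored = (
            (sum(a * b for a, b in zip(q, vec)), text)
            for vec, text in self._items
        )
        return heapq.nlargest(k, scored, key=lambda pair: pair[0])


index = ToyVectorIndex()
index.add([0.9, 0.1, 0.0], "revenue report")
index.add([0.7, 0.3, 0.1], "earnings summary")
index.add([0.0, 0.2, 0.9], "office party notes")

top = index.search([1.0, 0.0, 0.0], k=2)
print([text for _, text in top])
```

With these toy vectors the two finance-like entries rank ahead of the unrelated one, because their vectors point in nearly the same direction as the query.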

FAISS (In-Memory, by Meta)

pip install faiss-cpu langchain-community

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# `chunks` is assumed to be a list of LangChain Document objects
vector_store = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Save / load to disk
vector_store.save_local("faiss_index")
vector_store = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    # Loading unpickles data from disk; only enable this for index files you created yourself
    allow_dangerous_deserialization=True,
)

Choosing a Vector Store

| Store | Best for | Notes |
| --- | --- | --- |
| ChromaDB | Development, small-medium | Open-source, disk-backed |
| FAISS | Fast local search | In-memory, no server needed |
| Pinecone | Managed production | SaaS |
| Weaviate | Self-hosted production | Open-source |
| pgvector | PostgreSQL + vectors | Extension |

Key Takeaways

  • Embeddings convert text → vectors; similar text → close vectors
  • All LangChain vector stores share the same interface — swap them without changing retrieval code
  • ChromaDB is the easiest starting point for local development

Common Mistakes

  1. Mixing embedding models — you must use the same embedding model for indexing and querying. Mixing text-embedding-3-small for indexing with text-embedding-ada-002 for querying produces nonsense results.
  2. Not normalising vectors — some vector stores require L2-normalised vectors for cosine similarity. Check your vector store’s requirements.
  3. Storing the vector store in /tmp: /tmp is cleared on restart. Persist your ChromaDB to a permanent directory: Chroma(persist_directory="./chroma_db").
  4. Re-embedding unchanged documents — embedding is expensive. Only re-embed documents that have changed. Track document hashes.
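
The hash tracking mentioned in mistake 4 can be sketched as follows. This is a minimal illustration, assuming documents are available as a mapping from an id to its text; `content_hash`, `select_changed`, and the sample file names are hypothetical and not part of any library.

```python
import hashlib


def content_hash(text):
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_changed(documents, seen_hashes):
    """Return ids of documents that are new or whose text has changed.

    documents:   dict mapping doc_id -> text
    seen_hashes: dict mapping doc_id -> hash from the previous run
                 (updated in place)
    """
    changed = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if seen_hashes.get(doc_id) != digest:
            changed.append(doc_id)
            seen_hashes[doc_id] = digest
    return changed


seen = {}
docs = {"a.txt": "alpha", "b.txt": "beta"}

first_run = select_changed(docs, seen)    # both documents are new
docs["b.txt"] = "beta, revised"
second_run = select_changed(docs, seen)   # only b.txt needs re-embedding
print(first_run, second_run)
```

In a real pipeline you would likely persist `seen_hashes` between runs (for example in a JSON file or a database table) and record a hash only after the embedding call for that document has succeeded.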

Quick Quiz

Test Your Understanding

Q1. What does cosine similarity measure between two embedding vectors?
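A1. The cosine of the angle between the vectors, which serves as a measure of semantic similarity. A value of 1.0 means the vectors point in the same direction (closest meaning); 0 means they are orthogonal (unrelated); -1 means they point in opposite directions.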

Q2. Why can you switch from ChromaDB to Pinecone without changing your LangChain retrieval code?
A2. LangChain wraps all vector stores with a common VectorStore interface. The .as_retriever() method is identical across all implementations.

Q3. What is the dimension of OpenAI’s text-embedding-3-small model output?
A3. 1536 dimensions by default (can be reduced, for example to 512, with the dimensions parameter).


Student Exercise

Exercise 4.3 — Embedding similarity explorer
Embed 10 sentences covering 2 topics (5 about Python programming, 5 about cooking). Compute all pairwise cosine similarities. Visualise with a heatmap. Verify that within-topic similarity > cross-topic similarity.

