Why are Vector Databases Used in RAG? (Interview-Focused)
In Retrieval-Augmented Generation systems, vector databases are used to store and retrieve embeddings efficiently so the LLM can access relevant external knowledge during response generation.
Simple Interview Definition
A vector database stores data as vector embeddings and helps retrieve the most semantically similar information for a query.
In RAG, this allows the system to find relevant documents even when the exact keywords are not present.
Why Normal Databases Are Not Enough
Traditional SQL or keyword search works well for:
- Exact matches
- Structured data
- Keyword filtering
But RAG needs:
- Semantic similarity
- Meaning-based search
- Fast nearest-neighbor retrieval over millions of embeddings
Example:
User asks:
"How do I reset my password?"
A keyword database may miss a document titled:
"Account credential recovery steps"
But embeddings capture semantic meaning, so vector search can still retrieve it.
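A minimal sketch of why plain keyword matching misses this pair: the query and the title share no words at all. The helper name `keyword_overlap` is made up for illustration and is not part of any search engine.

```python
import re


def keyword_overlap(query: str, title: str) -> set[str]:
    """Return the lowercase words that appear in both strings."""
    def tokenize(text: str) -> set[str]:
        return set(re.findall(r"[a-z]+", text.lower()))

    return tokenize(query) & tokenize(title)


query = "How do I reset my password?"
title = "Account credential recovery steps"

shared = keyword_overlap(query, title)
print(shared)  # set() -> no shared terms, so a keyword match finds nothing
```

An embedding model, by contrast, is trained so that these two texts land close together in vector space even though their vocabularies do not overlap.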
How Vector Databases Work in RAG
Step-by-Step Flow
1. Convert documents into embeddings
Documents are converted into vectors using embedding models.
Example:
- "Cats are animals" → [0.21, -0.77, ...]
- "Dogs are pets" → [0.18, -0.72, ...]
These vectors represent semantic meaning: texts with similar meaning map to vectors that sit close together.
2. Store embeddings in vector DB
The embeddings are stored in a vector database, or a general-purpose database with vector search, such as:
- Pinecone
- Weaviate
- Milvus
- Qdrant
- Chroma
- MongoDB Atlas Vector Search
- Azure Cosmos DB with vector search
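Whatever product is chosen, each stored item usually boils down to an ID, a vector, and some payload such as the original chunk text. The sketch below is a toy in-memory stand-in, not the API of any of the databases listed; `Record`, `InMemoryVectorStore`, and the vector values are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    id: str                      # e.g. "<document id>#<chunk number>"
    vector: list[float]          # the embedding
    payload: dict = field(default_factory=dict)  # original text, source, etc.


class InMemoryVectorStore:
    """Toy store: keeps records in a dict and enforces one vector size."""

    def __init__(self, dim: int):
        self.dim = dim
        self._records: dict[str, Record] = {}

    def upsert(self, record: Record) -> None:
        # All vectors in one collection must have the same dimensionality.
        if len(record.vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(record.vector)}")
        self._records[record.id] = record

    def get(self, record_id: str) -> Record:
        return self._records[record_id]

    def __len__(self) -> int:
        return len(self._records)


store = InMemoryVectorStore(dim=3)
store.upsert(Record("doc-1#0", [0.2, -0.7, 0.1], {"text": "Account credential recovery steps"}))
print(len(store))  # 1
```

Real vector databases add an index on top of this so that similarity queries do not have to touch every record.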
3. User query is converted into embedding
Question:
"How can I recover my account?"
becomes a vector embedding.
4. Similarity search happens
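The vector DB compares the query embedding with the stored embeddings using a similarity metric such as:
- Cosine similarity
- Euclidean distance
- Dot product
and returns the closest embeddings (the top-k nearest neighbors).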
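The three metrics can be sketched in a few lines of plain Python. The vectors here are tiny made-up examples chosen so the results are easy to check by hand; real embeddings have hundreds or thousands of dimensions.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product: larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b: 1.0 means same direction."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(cosine_similarity(query, same_direction))   # 1.0
print(cosine_similarity(query, orthogonal))       # 0.0
print(euclidean_distance(query, same_direction))  # 1.0
print(dot(query, same_direction))                 # 2.0
```

Note that cosine similarity ignores vector length while dot product and Euclidean distance do not, which is why the metric should match the one the embedding model was trained with.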
5. Relevant chunks are returned to the LLM
Retrieved documents are added to the prompt:
Context:
[Retrieved chunks]
Question:
How can I recover my account?
The LLM then generates a grounded answer.
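The prompt layout above can be assembled with a small helper. This is only a sketch of that template; `build_prompt` is a hypothetical name, and the chunk text is a placeholder.

```python
def build_prompt(chunks: list[str], question: str) -> str:
    """Join retrieved chunks into the Context/Question layout shown above."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion:\n{question}"


prompt = build_prompt(
    ["Account credential recovery steps: open Settings, then choose Recover account."],
    "How can I recover my account?",
)
print(prompt)
```

The resulting string is what gets sent to the LLM, so the model answers from the retrieved text rather than from memory alone.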
Why Vector Databases Are Important in RAG
1. Semantic Search
They search by meaning, not exact keywords.
This is the core reason RAG works well.
2. Fast Retrieval at Scale
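Comparing a query against millions of vectors one by one (brute-force search) is slow.
Vector DBs instead use approximate nearest neighbor (ANN) search, which trades a little recall for much lower latency. Common options include:
- HNSW (Hierarchical Navigable Small World), a graph-based ANN index
- FAISS, a similarity search library that provides several ANN index types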
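For contrast, here is what exact (brute-force) search looks like. It is a sketch of the baseline that ANN indexes are designed to avoid, not of an ANN algorithm itself; the function name and the toy vectors are made up for illustration.

```python
import heapq
import math


def cosine(a: list[float], b: list[float]) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def brute_force_top_k(query: list[float], vectors: dict[str, list[float]], k: int):
    """Exact search: scores every stored vector, so work grows linearly with the collection."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
top = brute_force_top_k([1.0, 0.0], vectors, k=2)
print([doc_id for _, doc_id in top])  # ['a', 'c']
```

With N stored vectors of dimension d, every query costs roughly N × d multiplications. An ANN index visits only a small fraction of the vectors, at the price of occasionally missing a true nearest neighbor.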
3. Better Context for LLMs
The retrieved chunks give the LLM grounded context, which leads to:
- Higher accuracy
- More relevant answers
- Fewer hallucinations
4. Supports Unstructured Data
Once the text is extracted and embedded, vector DBs work well with content from:
- PDFs
- Articles
- Emails
- Chat logs
- Documentation
- Resumes
This is why RAG is heavily used in enterprise AI systems.
Common Interview Question
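Q: Why not store embeddings in a SQL DB?
Answer:
You technically can, but vector databases are optimized for:
- High-dimensional vector indexing
- Similarity search
- Fast nearest-neighbor retrieval
- Scalability
A traditional database without a vector index (for example, one added by an extension such as pgvector) has to scan every row to answer a similarity query, so it gets slow as the collection grows.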
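To make the "you technically can" point concrete, here is a sketch that stores embeddings as JSON text in SQLite and ranks them in application code. The table name, column names, and two-dimensional vectors are all made up for illustration.

```python
import json
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")

# Toy 2-dimensional vectors; real embeddings come from an embedding model.
rows = [
    ("Account credential recovery steps", [0.9, 0.1]),
    ("Quarterly sales report", [0.1, 0.9]),
]
conn.executemany(
    "INSERT INTO chunks (text, embedding) VALUES (?, ?)",
    [(text, json.dumps(vec)) for text, vec in rows],
)


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def search(query_vec, k=1):
    """Full table scan: every row is read, decoded, and scored in Python."""
    scored = []
    for text, emb_json in conn.execute("SELECT text, embedding FROM chunks"):
        scored.append((cosine(query_vec, json.loads(emb_json)), text))
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]


best = search([1.0, 0.0])
print(best)  # ['Account credential recovery steps']
```

This works for a handful of rows, but the database cannot use an index to narrow the candidates, so every query reads the whole table.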
Important Interview Terms
Embedding
Numerical representation of text meaning.
Similarity Search
Finding vectors closest in semantic meaning.
ANN (Approximate Nearest Neighbor)
Technique for fast large-scale vector retrieval.
Chunking
Breaking documents into smaller searchable pieces.
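As an illustration of chunking, the sketch below splits text into fixed-size word windows with a small overlap so that a sentence cut at a boundary still appears whole in one chunk. Word-based splitting and the default sizes are assumptions made for this example; production systems often split by tokens, sentences, or document structure instead.

```python
def chunk_words(text: str, size: int = 5, overlap: int = 2) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_words("one two three four five six seven eight")
print(chunks)  # ['one two three four five', 'four five six seven eight']
```

Each chunk is then embedded and stored as its own record, which is what lets retrieval return a focused passage instead of a whole document.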
Real-World Example
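Suppose a company has 10 million support documents.
When a user asks a question:
1. A query embedding is generated
2. The vector DB retrieves the top relevant chunks
3. The LLM uses them to answer accurately
Without a vector index, every query would have to be compared against all stored embeddings, and keyword-only search would miss documents that are phrased differently from the question.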
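The whole flow can be sketched end to end on two documents. Important assumption: `toy_embed` below is a hand-written stand-in that counts words from a tiny made-up concept table. It is not a real embedding model and only exists so the example runs without external dependencies.

```python
import math
import re

# Hypothetical concept table standing in for what a trained model learns.
CONCEPTS = {
    "account": {"account", "password", "credential", "login"},
    "recover": {"reset", "recover", "recovery"},
    "billing": {"invoice", "payment", "billing", "refund"},
}


def toy_embed(text: str) -> list[float]:
    """Stand-in embedding: one dimension per concept, counting matching words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(sum(w in vocab for w in words)) for vocab in CONCEPTS.values()]


def cosine(a: list[float], b: list[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def retrieve(question: str, index: list[tuple[str, list[float]]], k: int = 1) -> list[str]:
    """Embed the question, rank stored chunks by cosine similarity, return the top k texts."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


documents = ["Account credential recovery steps", "Refund and invoice policy"]
index = [(doc, toy_embed(doc)) for doc in documents]   # steps 1-2: embed and store

question = "How do I reset my password?"
top_chunks = retrieve(question, index)                 # steps 3-4: embed query, search
prompt = "Context:\n" + "\n".join(top_chunks) + "\n\nQuestion:\n" + question  # step 5

print(top_chunks)  # ['Account credential recovery steps']
```

The question and the retrieved title share no words, yet they match because both map onto the same concept dimensions, which is the effect a real embedding model provides at much larger scale.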
Interview-Friendly Summary
Vector databases are used in RAG to store and retrieve embeddings efficiently using semantic similarity search. They help the system quickly find the most relevant contextual information, which improves LLM response accuracy and reduces hallucinations.
