Hybrid Search
A retrieval approach that combines traditional keyword matching (BM25) with semantic vector search to capture both the precision of exact term matches and the contextual understanding of meaning-based search.
Why it matters
Hybrid search often outperforms keyword-only or vector-only retrieval, with gains commonly cited in the 10-30% range on retrieval-quality metrics. That makes it a common default for production RAG systems that need both precision and semantic understanding.
Why Neither Approach Alone Is Enough
Keyword search (BM25) excels at finding exact terms such as product codes, proper nouns, and specific error messages, but it misses synonyms and paraphrases because it only matches the tokens in the query. Search for "automobile repair" and it won't return results about "car maintenance." Semantic search handles meaning well but can struggle with precise terms: it might rank a general article about vehicles higher than the specific technical spec you need, because both documents sit close to the query in vector space.
These aren't edge cases. In production RAG systems, purely semantic retrieval typically misses 15-25% of results that keyword search would catch, and vice versa. Each approach has a blind spot that the other covers.
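The keyword blind spot described above can be made concrete with a minimal BM25 sketch. This is an illustration under simplifying assumptions, not a production scorer: the whitespace tokenizer, the parameter defaults (k1 = 1.5, b = 0.75), and the two sample documents are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with a basic Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # no lexical overlap means no contribution
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: longer-than-average documents are penalised.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "automobile repair manual for brake pads",
    "car maintenance tips for every season",
]
scores = bm25_scores("automobile repair", docs)
```

The second document shares no tokens with the query, so its score is exactly zero even though it is about the same topic. That is the gap a vector retriever is meant to cover.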
How the Combination Works
Hybrid search runs both retrieval strategies in parallel against the same corpus, then fuses their results:
- BM25 retrieval — Scores documents based on term frequency, inverse document frequency, and document length normalisation. Returns a ranked list based on lexical overlap.
- Vector retrieval — Embeds the query and finds the nearest document vectors by cosine similarity. Returns a ranked list based on semantic closeness.
- Reciprocal Rank Fusion (RRF): merges both ranked lists into a single result set. Each document gets a score of 1/(k + rank) from each list in which it appears, where rank is its position in that list and k is a smoothing constant (60 is a common choice), and the scores are summed. Documents appearing near the top of both lists get a significant boost.
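The fusion step above can be sketched in a few lines. This is a minimal illustration, assuming each retriever returns a list of document IDs ordered best-first. The function name, the sample IDs, and the default k = 60 (a commonly used value, not one specified here) are assumptions for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several best-first rankings by summing 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical outputs from the two retrievers.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
fused_ids = [doc_id for doc_id, _ in fused]
```

Here doc_a and doc_c appear in both lists and end up ahead of doc_b and doc_d, which each appear in only one. Because RRF uses rank positions rather than raw scores, it needs no normalisation between BM25 scores and cosine similarities.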
Production Best Practices
Most vector databases now support hybrid search natively. Pinecone offers sparse-dense vectors, Weaviate has a built-in hybrid query API, and pgvector can be combined with PostgreSQL's full-text search in a single query.
The weighting between keyword and semantic results (often called alpha) matters. A 0.5/0.5 split is a reasonable default, but domains with lots of technical jargon or identifiers tend to benefit from heavier keyword weight (0.6-0.7 BM25), while conversational or exploratory queries favour semantic weight. Some teams make alpha dynamic based on query characteristics — short, specific queries get more keyword weight, while longer natural-language questions lean semantic.
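One way to picture the weighting is as a convex combination of normalised scores. The sketch below is an assumption-laden illustration, not any particular database's implementation: the min-max normalisation, the `keyword_weight` parameter name, the thresholds inside `choose_keyword_weight`, and the sample scores are all hypothetical. Tools also differ on whether alpha weights the keyword side or the vector side, so check the convention before copying values across.

```python
def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so BM25 and cosine values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(
    bm25: dict[str, float],
    vector: dict[str, float],
    keyword_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Blend normalised keyword and vector scores; missing documents count as 0."""
    bm25_n = min_max_normalize(bm25)
    vector_n = min_max_normalize(vector)
    fused = {
        doc_id: keyword_weight * bm25_n.get(doc_id, 0.0)
        + (1.0 - keyword_weight) * vector_n.get(doc_id, 0.0)
        for doc_id in set(bm25_n) | set(vector_n)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


def choose_keyword_weight(query: str) -> float:
    """Hypothetical heuristic: short or identifier-like queries lean keyword."""
    tokens = query.split()
    has_identifier = any(any(ch.isdigit() for ch in tok) for tok in tokens)
    if has_identifier or len(tokens) <= 3:
        return 0.7
    if len(tokens) >= 8:
        return 0.3
    return 0.5


# Hypothetical raw scores from each retriever.
bm25 = {"doc_a": 12.0, "doc_b": 4.0, "doc_c": 0.0}
vector = {"doc_a": 0.60, "doc_b": 0.90, "doc_c": 0.80}

keyword_heavy = [d for d, _ in weighted_fusion(bm25, vector, keyword_weight=0.7)]
semantic_heavy = [d for d, _ in weighted_fusion(bm25, vector, keyword_weight=0.3)]
```

With these sample scores, the keyword-heavy setting puts doc_a first, while the semantic-heavy setting puts doc_b first. The same candidates are reordered purely by the weight, which is why it is worth tuning the value against real queries from your domain.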