Edition #8: Building Search That Doesn't Suck (Vector + Keyword)
Welcome back to Fine-Tuned. This week we are fixing your search bar.

### 🔬 The Deep Dive: Hybrid Search Is Mandatory

If you replaced your application’s standard keyword search with pure vector search (embeddings) over the last year, your users are probably frustrated.

Vector search is incredible for conceptual queries (“Show me documents about budget constraints”). But it is notoriously bad at exact keyword matching (“Show me invoice #INV-49201”).

**The Solution: Hybrid Search (BM25 + Vector)**

You need to run both methods and fuse their rankings. Here is the modern playbook for search:

1. **Dense vector search:** Embed your documents with an open-source embedding model (like bge-m3) to capture semantic meaning.
2. **Sparse keyword search:** Use an algorithm like BM25 (the ranking function behind Elasticsearch) to match exact tokens.
3. **Reciprocal Rank Fusion (RRF):** Run both searches in parallel, then mathematically combine the ranked lists so that a document scoring high on both semantic meaning and exact keyword match rises to the top.

*Tactical tip:* Stop paying for a dedicated vector database just to get basic search. PostgreSQL with pgvector now supports HNSW indexing, so you can keep your vectors right next to your relational data. It’s cheaper, simpler, and completely fine for anything under 10 million rows.

---

### 🗞️ The Roundup: 3 Big Updates This Week

**1. OpenAI Drops Embeddings Pricing Again:** The cost to embed a million tokens is now fractions of a cent, officially making chunking strategy, not cost, the real bottleneck in RAG pipelines.
**2. Contextual Retrieval by Anthropic:** Anthropic published a methodology where every chunk of text in your database is prepended with an AI-generated summary situating it within the whole document.
**3. The Rise of ‘Small’ Open Weights:** Meta’s new 1B and 3B parameter models are outperforming GPT-3.5 on reasoning tasks.

---

### 🛠️ Tool of the Week: Qdrant

If you do need a dedicated vector database because you are operating at massive scale (100M+ vectors), Qdrant is currently the developer favorite. It’s written in Rust, insanely fast, and supports hybrid search natively.

---

*Keep building.*
- Kyle Anderson
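
*P.S.* Step 3 of the playbook fits in about a dozen lines. Here is a minimal sketch of Reciprocal Rank Fusion in Python, assuming each retriever hands you an ordered list of document IDs (the `doc_*` IDs below are made up for illustration):

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of doc IDs.

    Each document earns 1 / (k + rank) per list it appears in
    (rank is 1-based); k=60 is the constant from the original RRF paper.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Vector search loves the conceptual match; BM25 nails the exact token.
vector_hits = ["doc_budget", "doc_q3_report", "doc_invoice_49201"]
bm25_hits = ["doc_invoice_49201", "doc_budget"]

print(rrf_fuse([vector_hits, bm25_hits]))
# → ['doc_budget', 'doc_invoice_49201', 'doc_q3_report']
```

Notice that the invoice document, ranked last by the vector search, jumps to second place because BM25 put it first: that is exactly the behavior you want for queries mixing concepts with exact identifiers.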