BM25 vs TF-IDF: Why BM25 Delivers Better Search Results

Published April 18, 2026 · Last updated April 29, 2026

If you're building a search engine, you'll inevitably encounter TF-IDF and BM25. Both are term-based ranking algorithms that score how relevant a document is to a query. But BM25 consistently produces better results in practice. Here's why.

What TF-IDF does

TF-IDF (Term Frequency-Inverse Document Frequency) scores a document based on two factors: how often a term appears in the document (TF) and how rare that term is across all documents (IDF). The intuition is simple: a term that appears frequently in one document but rarely across the corpus is probably important to that document.

The problem is that, in its basic form, TF grows linearly. With the same IDF, a document that mentions "search" 100 times gets ten times the score of one that mentions it 10 times. In reality, beyond a certain point, more repetitions don't mean more relevance; they may signal keyword stuffing.
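
To make the linear growth concrete, here is a minimal sketch of the basic raw-count TF-IDF formula. The function name, the corpus size of 1,000 documents, and the 50 documents containing the term are illustrative assumptions, and many real implementations dampen TF with a logarithm or square root instead.

```python
import math

def tf_idf(term_count: int, docs_with_term: int, total_docs: int) -> float:
    """Raw-count TF multiplied by a log-scaled IDF, for one term in one document."""
    idf = math.log(total_docs / docs_with_term)
    return term_count * idf

# Hypothetical corpus: 1,000 documents, 50 of which contain "search".
score_10 = tf_idf(10, docs_with_term=50, total_docs=1000)
score_100 = tf_idf(100, docs_with_term=50, total_docs=1000)

# Ten times the mentions gives about ten times the score.
print(score_100 / score_10)
```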

How BM25 improves on TF-IDF

BM25 (Best Matching 25) adds two critical improvements:

1. Term frequency saturation

BM25 introduces a saturation curve controlled by the parameter k1. Instead of growing linearly, the term-frequency contribution flattens toward an upper bound as the count rises. A document mentioning "search" 10 times scores only slightly lower than one mentioning it 100 times. This prevents repetitive documents from dominating results.
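
The saturation curve is easy to see in isolation. This sketch uses the standard BM25 term-frequency component with length normalization left out, and assumes k1 = 1.2, a common default rather than a value taken from this article. The function name is illustrative.

```python
def saturated_tf(tf: float, k1: float = 1.2) -> float:
    """BM25 term-frequency component, ignoring document length."""
    return tf * (k1 + 1) / (tf + k1)

for tf in (1, 10, 100):
    print(tf, round(saturated_tf(tf), 3))
# 1 1.0
# 10 1.964
# 100 2.174
```

With k1 = 1.2 the curve can never exceed k1 + 1 = 2.2, so going from 10 to 100 mentions adds only about 0.21 to the term's contribution. Lower k1 values make the curve flatten sooner; higher values let repetition count for longer.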

2. Document length normalization

The parameter b, which ranges from 0 to 1, controls how strongly document length affects scoring: 0 turns length normalization off and 1 applies it fully. A short document that mentions your query term once is probably more focused and relevant than a 10,000-word document that mentions it once in passing. BM25 normalizes for this using the ratio of document length to average document length in the corpus.
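
Here is a sketch of how length normalization enters the same term-frequency component. The defaults k1 = 1.2 and b = 0.75 are commonly used values and are assumptions here, as are the function name and the example document lengths.

```python
def bm25_tf(tf: float, doc_len: int, avg_doc_len: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """BM25 term-frequency component with document length normalization."""
    # norm is 1.0 for an average-length document, below 1 for shorter ones
    # and above 1 for longer ones.
    norm = 1 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1) / (tf + k1 * norm)

# One mention of the term in a 100-word document vs. a 10,000-word document,
# in a hypothetical corpus averaging 1,000 words per document.
short_doc = bm25_tf(1, doc_len=100, avg_doc_len=1000)
long_doc = bm25_tf(1, doc_len=10_000, avg_doc_len=1000)
print(short_doc > long_doc)  # True: the short document scores higher
```

In this example the short document's contribution is roughly 1.58 and the long document's roughly 0.21. Setting b=0 removes the length effect entirely, so both documents would score the same.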

BM25F: field-level weighting

BM25F extends BM25 to support multiple fields with different weights. In a product catalog, a match in the title should matter more than a match in the description. BM25F lets you assign weights per field:

Title: weight 3.0
Brand: weight 2.0
Description: weight 1.0

The weighted term frequencies are combined before applying the saturation function, giving you a single relevance score that accounts for where matches occur.
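
As a rough illustration of "combine first, saturate second", the sketch below uses the field weights listed above. It is a simplified form of BM25F: per-field length normalization is omitted for brevity, and the function name, the k1 default, and the IDF value of 2.0 are assumptions made for the example.

```python
from typing import Mapping

# Field weights from the product catalog example.
FIELD_WEIGHTS = {"title": 3.0, "brand": 2.0, "description": 1.0}

def bm25f_term_score(field_tfs: Mapping[str, int], idf: float, k1: float = 1.2) -> float:
    """Score one query term against one document across weighted fields."""
    # Step 1: fold per-field counts into a single weighted term frequency.
    combined_tf = sum(FIELD_WEIGHTS[field] * tf for field, tf in field_tfs.items())
    # Step 2: saturate the combined value once, then scale by IDF.
    return idf * combined_tf / (combined_tf + k1)

title_hit = bm25f_term_score({"title": 1}, idf=2.0)
description_hit = bm25f_term_score({"description": 1}, idf=2.0)
print(title_hit > description_hit)  # True: a title match outranks a description match
```

Saturating once on the combined value, rather than once per field, means a term repeated across several fields still hits the same ceiling instead of being rewarded separately in each field.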

Practical impact

In LumoSearch, BM25F is the default ranking algorithm. When you create a collection with weighted fields and upload documents, the indexing worker computes term frequencies, document lengths, and IDF values for the entire corpus. At search time, scoring is a simple arithmetic operation on the pre-computed index.
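
The split between index-time and search-time work might look something like the sketch below. This is not LumoSearch's actual implementation: it is a single-field BM25 illustration with hypothetical function names, pre-tokenized documents, a commonly used smoothed IDF, and assumed defaults for k1 and b.

```python
import math
from collections import Counter

def build_index(docs: list[list[str]]) -> dict:
    """Index time: precompute term counts, document lengths, and IDF values."""
    term_counts = [Counter(tokens) for tokens in docs]
    doc_lens = [len(tokens) for tokens in docs]
    n = len(docs)
    # Number of documents containing each term.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    # Smoothed IDF that stays positive even for very common terms.
    idf = {t: math.log(1 + (n - df + 0.5) / (df + 0.5)) for t, df in doc_freq.items()}
    return {"tf": term_counts, "len": doc_lens, "avg_len": sum(doc_lens) / n, "idf": idf}

def score(index: dict, doc_id: int, query: list[str],
          k1: float = 1.2, b: float = 0.75) -> float:
    """Search time: lookups plus arithmetic on the precomputed values."""
    norm = 1 - b + b * index["len"][doc_id] / index["avg_len"]
    total = 0.0
    for term in query:
        tf = index["tf"][doc_id].get(term, 0)
        total += index["idf"].get(term, 0.0) * tf * (k1 + 1) / (tf + k1 * norm)
    return total

docs = [
    ["fast", "search", "engine"],
    ["search", "search", "tips", "for", "busy", "teams"],
    ["cooking", "recipes"],
]
index = build_index(docs)
ranked = sorted(range(len(docs)), key=lambda i: score(index, i, ["search"]), reverse=True)
print(ranked)  # the document without "search" comes last
```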

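The result: relevant, well-ranked search results from sensible defaults, before any manual tuning. BM25 and its field-weighted variant BM25F have been standard baselines in information retrieval research for over two decades because they hold up well across many kinds of corpora.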

When TF-IDF is enough

For very small corpora (under 100 documents) or when all documents are roughly the same length, TF-IDF and BM25 produce similar results. But as your corpus grows and document lengths vary, BM25's saturation and normalization become essential for good relevance.

Try BM25F ranking in LumoSearch

LumoSearch uses BM25F by default. Create a collection with weighted fields, upload your data, and get relevant search results immediately. Learn more about features.