Search Tuning & Relevance

Fine-tune how results are ranked

Search Tuning

Search tuning lets you control how results are ranked when someone searches your Opensolr index. Think of it as a mixing board in a recording studio: each slider changes how much weight a particular signal carries in the final output. Adjusting these settings can dramatically improve the quality and relevance of your search results. For a deep dive into every tuning parameter, see the Search Tuning reference in the knowledge base.

How Search Tuning Works

Every document in your index has multiple fields: a title, a description, body text, the URL, and more. When a user searches, Opensolr scores each document by checking how well the query matches each field. The field weights you set determine how much each field contributes to the final relevance score.

[Figure: The Relevance Mixer. Each slider controls how much a field influences your search ranking. Default slider values: Title 0.10, Description 0.05, URI 0.01, Text 0.01, LD Text 0.01. The weighted signals feed the ranking step, which outputs the ranked results (#1 best matching document, #2 second best match, #3 third best match).]

Field Weights

Field weights tell Opensolr which parts of your documents matter most. A higher weight means matches in that field count more toward the relevance score. For example, if a user's search query appears in the title of a page, that is usually a stronger signal of relevance than if it only appears somewhere in the body text.

| Field | What It Contains | Default Weight | When to Increase |
|---|---|---|---|
| Title | The page title (from the HTML <title> tag) | 0.10 | When page titles are descriptive and unique |
| Description | The meta description or page summary | 0.05 | When descriptions are carefully written summaries |
| URI | The page URL (e.g., /products/blue-shoes) | 0.01 | When URLs contain meaningful keywords |
| Text | The full body text of the page | 0.01 | When pages have rich, well-written content |
| LD Text | The cleaner JSON-LD / structured data text (text_t Solr field) | 0.01 | When your site has good schema.org markup and you want structured content to weigh more than raw body text |

How to adjust

Each weight ranges from 0.01 (almost ignored) to 1.0 (maximum importance). Move the slider or type a value directly. Changes take effect on the next search query. The defaults are deliberately low (at most 0.10, on Title) so that the lexical score stays in the same order of magnitude as the KNN cosine similarity score (0–1) when the two are combined inside the hybrid {!bool} query. Pushing weights much higher lets the lexical score dominate the combined score, so semantic similarity has almost no effect on ranking. To understand how these weights interact with the eDisMax query parser under the hood, see The eDisMax Query Parser Explained.
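
As a rough illustration of how field weights could translate into an eDisMax qf (query fields) value, here is a minimal Python sketch. The Solr field names other than text_t are hypothetical placeholders, and the clamping simply mirrors the slider range described above; this is not necessarily how Opensolr builds the parameter internally.

```python
# Hypothetical Solr field names (assumption); only text_t is named on this page.
DEFAULT_WEIGHTS = {
    "title": 0.10,
    "description": 0.05,
    "uri": 0.01,
    "text": 0.01,
    "text_t": 0.01,  # LD Text
}


def build_qf(weights):
    """Return an eDisMax-style qf string such as 'title^0.1 description^0.05'."""
    parts = []
    for field, weight in weights.items():
        # Clamp to the slider range described above (0.01 to 1.0).
        clamped = min(1.0, max(0.01, weight))
        parts.append(f"{field}^{clamped:g}")
    return " ".join(parts)


qf = build_qf(DEFAULT_WEIGHTS)
print(qf)
```

Raising a single entry, for example setting "title" to 0.30, changes only that field's boost in the resulting string, which makes it easy to test one adjustment at a time.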

Start with defaults

The default weights work well for most websites. Only adjust them after testing your search results and finding room for improvement.

Test with real queries

After changing weights, search for terms your users actually search for. Check whether the most relevant pages now appear near the top.

Minimum Match

Minimum Match (sometimes called mm) controls how many of the search words must appear in a document for it to be considered a match. This is one of the most powerful relevance controls you have.

Imagine a user searches for "best italian restaurant new york". That is 5 words. Should a document match if it contains only 2 of those words? Or should all 5 be required?

[Figure: Minimum Match Modes Compared, for the 5-word query "best italian restaurant new york". Flexible: 3 out of 5 words must match; more results, some loosely related. Balanced: 4 out of 5 words must match; good balance of precision and recall. Strict: all 5 words must match; fewest results, but highly relevant.]

Flexible (Default)

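Returns the most results. Expression: 2<65% 4<50% 8<40%. Queries of one or two terms must match every term; queries with more than 2 terms need about 65%, more than 4 terms about 50%, and more than 8 terms about 40%. Great for broad, conversational searches across content sites.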

Balanced

A good middle ground. 2<90% 5<75% 8<60% 12<50%. Most search words must appear, with up to 50% missing on very long queries. Use when defaults feel too loose.

Strict

Almost every word must appear in the document. 2<95% 5<90% 8<80%. Use when precision matters more than finding many results — product catalogs, legal documents, exact-match-driven search.

Custom expressions

Advanced users can type a custom minimum match expression, like 2<75% (which means: for queries with more than 2 words, at least 75% must match). These use Solr's mm parameter syntax. Most users should stick with the three preset modes. For more on how these fields are analyzed and tokenized before matching, see Best Fulltext Solr Fields.
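
To make the expression syntax concrete, the following Python sketch computes how many terms must match for a given query length. It is a simplified reading of Solr's documented mm rules (conditions apply when the term count is greater than the number before "<", and percentages are rounded down). It handles only the positive N<P% form used by the presets and is not Solr's actual implementation; the function name required_terms is made up for this example.

```python
def required_terms(mm, term_count):
    """Minimum number of query terms that must match under an mm expression.

    Simplified sketch: supports only space-separated 'N<P%' conditions with
    positive percentages, as used by the three presets.
    """
    required = term_count  # at or below the first threshold, every term is required
    for condition in mm.split():
        threshold, percent = condition.split("<")
        if term_count <= int(threshold):
            break
        # Percentages are rounded down (integer floor division).
        required = (term_count * int(percent.rstrip("%"))) // 100
    return required


PRESETS = {
    "Flexible": "2<65% 4<50% 8<40%",
    "Balanced": "2<90% 5<75% 8<60% 12<50%",
    "Strict": "2<95% 5<90% 8<80%",
}

print(required_terms("2<75%", 4))  # 75% of 4 terms
for name, mm in PRESETS.items():
    print(name, {n: required_terms(mm, n) for n in (2, 4, 6, 10)})
```

Because percentages round down in this reading, the count for a particular query length can be lower than a rounded percentage suggests, so it is worth checking a custom expression against the query lengths your users actually type.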

Content Quality Boost

The Content Quality Boost rewards documents that have rich, substantial content and gently pushes down documents with thin, sparse content. Think of it as a "depth detector" for your pages.

[Figure: Quality Boost slider ranging from 0.0 to 1.0, shown at 0.60, contrasting a thin page (50 words) with a rich page (1,200 words).]

  • 0.0 (default) = Quality boost is turned off. All documents scored purely on keyword + semantic relevance.
  • 0.5 = A moderate boost. Rich content gets a gentle lift; thin content is slightly downranked.
  • 1.0 = Maximum boost. Pages with substantial, in-depth content are strongly favored.

When to use it

The default is 0 (off). Leave it there if your pages have roughly uniform depth. Enable a moderate quality boost (0.3 to 0.6) if your site mixes long articles with short stub pages and you want detailed articles to rank higher. The boost uses the quality_f Solr field, which is computed at indexing time from title length, description length, body text length, and the presence of an og:image.

Results Per Page

This slider controls how many search results are returned per page. Range 10 to 200, default 10. Applies both to the hosted Search UI and to API responses when no explicit rows parameter is sent.

  • 10-20 — Best for clean, focused search pages. Users see the top results quickly.
  • 30-50 — Good for research-heavy use cases where users want to browse many results at once.
  • 100-200 — Suitable for internal tools or data exploration dashboards. Note: larger numbers may make pages load slightly slower.
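
For API use, the interaction with the rows parameter can be sketched as follows. This minimal Python example only builds a query string; the helper name search_params is hypothetical, and the endpoint you send it to depends on your index.

```python
from urllib.parse import urlencode


def search_params(query, rows=None):
    """Build query-string parameters; omit rows to use the configured default."""
    params = {"q": query}
    if rows is not None:
        # An explicit rows value is sent, so the Results Per Page setting is not used.
        params["rows"] = rows
    return urlencode(params)


print(search_params("blue shoes"))
print(search_params("blue shoes", rows=50))
```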

Vector Search Settings

The following settings appear only for indexes with vector search enabled. If your index uses keyword-only search, you will not see these options. Vector search uses AI to understand the meaning behind words, not just the exact words typed.

Semantic vs. Keyword Balance

This is the most important vector search setting. It controls the balance between traditional keyword matching (finding documents that contain the exact words typed) and semantic/AI matching (finding documents that are about the same topic, even if they use different words).

[Figure: Semantic vs. Keyword Balance. Semantic matching understands meaning ("car" also finds "vehicle"); lexical matching uses exact word matches via BM25 ("car" finds "car" only). The Lexical Weight slider runs from 10% (mostly semantic) to 100% (full lexical), with a default of 0.30. Slide left for more semantic weight, right for more keyword weight.]

The slider value maps directly to the lexical_weight parameter (range 0.10 to 1.00, default 0.30). Internally we apply a non-linear reshape pow(BM25 + 1, lexical_weight) - 1, so the slider shapes the lexical contribution curve, not just scales it linearly.

  • 1.00 (full right) — full BM25, lexical dominates. Best for product / SKU / identifier search.
  • 0.50 — sqrt-like compression of BM25, gentle blend.
  • 0.30 (default) — vector takes the lead, lexical plays a supporting role. Good general-purpose value for content sites.
  • 0.10 (full left) — lexical heavily compressed, vector dominates. Best for natural-language / question-style queries.

Recommendation

The default 0.30 is the right starting point for general-purpose hybrid search and works well for most content sites. Push it to 0.50–0.80 if your users search for exact phrases, product names, SKUs, or named entities. Push it down to 0.15–0.20 for natural-language Q&A or knowledge-base queries where meaning matters more than exact words. For a detailed explanation of how hybrid search blends keyword and vector scoring, see Hybrid Search in Opensolr.
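
The reshape formula pow(BM25 + 1, lexical_weight) - 1 given above can be explored with a few lines of Python. The BM25 input values below are made-up illustrations, not scores from a real index, and the function name reshape_lexical is only for this sketch.

```python
def reshape_lexical(bm25, lexical_weight):
    """Apply pow(BM25 + 1, lexical_weight) - 1 to a raw BM25 score."""
    return (bm25 + 1.0) ** lexical_weight - 1.0


# Illustrative BM25 scores (assumed values): weak, moderate, and strong matches.
sample_scores = (0.5, 3.0, 15.0)

for weight in (1.00, 0.50, 0.30, 0.10):
    row = [round(reshape_lexical(score, weight), 3) for score in sample_scores]
    print(f"lexical_weight={weight:.2f} -> {row}")
```

At 1.00 the scores pass through unchanged, while at 0.50 a BM25 score of 15.0 is compressed to 3.0. Lower weights shrink large lexical scores much more than small ones, which is why the slider shapes the curve rather than scaling it linearly.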

Search Mode

Search Mode controls how keyword results and semantic results are combined inside the underlying {!bool} Solr query. Think of it as choosing how the two result lists are merged together. Default is Union.

Union (Default, Broadest)

Shows results from either keyword matches or semantic matches. If a document appears in either list, it is included. This gives you the most results and the broadest coverage.

Keywords Required

A document must contain the search keywords, but its rank is also influenced by semantic similarity. Guarantees keyword relevance while using AI to improve ordering.

Meaning Required

A document must be semantically similar to the query (the AI must consider it relevant), but keyword matches boost the score further. Good for conceptual searches.

Intersection (Strictest)

A document must appear in both the keyword results and the semantic results. Only documents that match the exact words AND are semantically relevant are shown. Fewest results, highest precision.

Lexical Only

Skips the embedding request entirely and runs pure keyword (BM25) search. No vector scoring at all. Useful when you want fully deterministic keyword ranking, or want to avoid embedding latency. Automatically sets Minimum Relevance Score to 0.
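
One way to picture the five modes is as set operations on two result lists. The Python sketch below is a conceptual model only: it shows which documents are eligible in each mode, not how they are scored, and the mode identifiers are hypothetical names rather than actual parameter values.

```python
def combine(mode, keyword_hits, semantic_hits):
    """Return the set of document ids eligible under a given Search Mode."""
    if mode == "union":
        return keyword_hits | semantic_hits
    if mode == "intersection":
        return keyword_hits & semantic_hits
    if mode in ("keywords_required", "lexical_only"):
        # Semantic similarity only influences ranking (or is skipped entirely).
        return set(keyword_hits)
    if mode == "meaning_required":
        # Keyword matches only influence ranking.
        return set(semantic_hits)
    raise ValueError(f"unknown mode: {mode}")


keyword_hits = {"doc1", "doc2", "doc3"}
semantic_hits = {"doc2", "doc3", "doc4"}

for mode in ("union", "keywords_required", "meaning_required",
             "intersection", "lexical_only"):
    print(mode, sorted(combine(mode, keyword_hits, semantic_hits)))
```

In this model, Keywords Required and Lexical Only admit the same documents; the difference is that Keywords Required still uses semantic similarity to order them.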

Vector Candidate Pool (topK)

When Opensolr performs a semantic search, it first asks the AI to find the most similar documents from your entire index. The topK slider controls how many candidates the AI considers.

  • Lower topK (e.g., 50) = Faster, but might miss some relevant documents that are further down the similarity list.
  • Higher topK (e.g., 500) = Considers more candidates, potentially finding more relevant results, but takes slightly longer.

Default: 200

The default value of 200 works well for most indexes. Increase it only if you have a very large index (100,000+ documents) and feel that semantic search is missing relevant results.
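
In plain Solr, topK is a local parameter of the {!knn} query parser. The Python sketch below formats such a query string; the vector field name "embeddings" is a hypothetical placeholder, the three-value vector is a toy example, and whether Opensolr issues exactly this form is an assumption.

```python
def knn_query(vector, top_k=200, field="embeddings"):
    """Format a Solr {!knn} query string for the given query vector."""
    values = ", ".join(str(v) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{values}]"


# Toy 3-dimensional vector; real embeddings have hundreds of dimensions.
print(knn_query([0.12, -0.4, 0.33]))
print(knn_query([0.12, -0.4, 0.33], top_k=500))
```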

Minimum Relevance Score

A slider from 0 to 1.0 (default 0, off) that removes results whose combined keyword + semantic score falls below the threshold. It is applied as a Solr frange post-filter on the combined {!bool}(lexical+vector) score. Only results that clear the bar are shown; results below it are also excluded from facet counts and numFound.

  • 0 (default, off) — all candidate results are returned. Facet counts may include near-zero noise matches.
  • 0.1–0.3 — recommended sweet spot. Removes pure noise while keeping legitimate matches.
  • 0.4–1.0 — increasingly strict. Use only if results feel very off-topic.

Automatically set to 0 in Lexical Only mode

The threshold is calibrated for the combined hybrid score. In Lexical Only mode the score distribution is different, so the slider is automatically zeroed when that mode is selected.
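
The effect of the threshold can be modelled in a few lines of Python. This sketch assumes an inclusive lower bound and uses made-up scores; it imitates the behaviour described above rather than reproducing the actual Solr filter.

```python
def apply_min_score(results, min_score, lexical_only=False):
    """Keep results whose combined score clears the threshold."""
    if lexical_only:
        min_score = 0.0  # the threshold is zeroed in Lexical Only mode
    kept = [r for r in results if r["score"] >= min_score]
    # The match count reflects only the surviving results.
    return kept, len(kept)


# Made-up combined scores for illustration.
results = [
    {"id": "doc1", "score": 0.82},
    {"id": "doc2", "score": 0.41},
    {"id": "doc3", "score": 0.04},  # near-zero noise match
]

kept, num_found = apply_min_score(results, 0.2)
print([r["id"] for r in kept], num_found)
```

With the threshold at 0.2, the noise match is dropped and the match count falls from 3 to 2; with lexical_only=True, all three results are kept.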

Reset All Settings

If you have experimented with many settings and want to start fresh, click the Reset All button at the bottom of the search tuning panel. This restores every slider and option to its factory default. Your search results will go back to the standard relevance ranking.

Heads up

Resetting is instant and cannot be undone. If you have a configuration you like, make a note of your settings before resetting.

Search Tuning Tips

1 Start simple

Begin with the default settings. Only change things if your search results are not satisfactory. Small adjustments often make a big difference.

2 Change one thing at a time

Adjust a single slider, run your test queries, and evaluate. Changing multiple settings at once makes it impossible to know which change helped (or hurt).

3 Use real queries

Test with the actual search queries your users type, not made-up examples. Check your Analytics page to see what people search for.

4 For e-commerce, boost Title

Product searches usually work best with a higher Title weight (try 0.20–0.40) so product names dominate ranking. Stay under 1.0 so the lexical score doesn't drown out semantic similarity in the hybrid combiner.

5 For blogs/docs, boost Text and LD Text

Content-heavy sites often benefit from raising Text and LD Text from the default 0.01 to 0.05–0.10. The answer to the user's query is usually buried deep in the page body, not just the title. LD Text is the cleaner JSON-LD-derived field with less boilerplate noise.

6 Vector search is not always better

If your users search for very specific terms (part numbers, SKUs, exact phrases), keyword search will outperform semantic search. Use the balance slider to favor keywords in those cases.

Further reading

For general guidance on building high-quality search experiences, see Solr Best Practices. It covers schema design, indexing strategies, and query optimization tips that complement the tuning controls described on this page.
