Search Tuning — Per-Index Relevancy Controls


Search Tuning is a panel inside the Index Settings of every Opensolr index that lets you shape how results are ranked for that one index, with no code changes and no re-indexing. Every setting is read at query time, so the very next search you run after moving a slider already uses the new value.

This page documents the controls exactly as they appear in the Opensolr admin panel today, with the real defaults from config/search_defaults.php and the live UI in tools_new.php. If you are a Drupal or WordPress user the same settings exist in your CMS module / plugin, with slightly different per-CMS defaults — see the note at the bottom of this page.

Where to find it

Open your index in the dashboard, then open the Search Tuning section in the Index Settings sidebar:

https://opensolr.com/admin/solr_manager/tools/YOUR_INDEX_NAME

Every change is saved automatically (debounced ~400 ms). There is no Save button.

The controls, in the order they appear in the UI

The panel shows different controls depending on whether your index is vector-enabled or pure lexical. Vector-only controls are clearly noted below.

What “vector-enabled” means — and what it costs

A vector-enabled index has a 1024-dimension dense-vector field (embeddings) in its schema, populated at index time by our GPU-backed embedding service (E5-large-instruct, multilingual). That vector field is what unlocks the AI / hybrid capabilities of Opensolr search:

Hybrid Search — lexical (BM25) and semantic (KNN) combined inside {!bool} with the four hybrid search modes described in section 3 below (Union / Keywords Required / Meaning Required / Intersection).
Pure Vector / Semantic Search — KNN-only mode with the Semantic ↔ Lexical Balance slider pushed all the way left.
AI Hints — the streaming AI summary panel above search results.
AI Reader — the full-screen AI-generated reading-mode for any single document.

None of these features are included in the standard Opensolr hosting plans. Standard plans give you full lexical search, faceting, autocomplete, spellcheck, query elevation, multilingual support, and all the rest — out of the box, no add-ons. Vector and AI features run on dedicated GPU infrastructure (LLM + embedding model) which is significantly more expensive to operate than a CPU-only Solr cluster, so they live on custom AI-enabled plans tailored per customer based on expected query volume, embedding budget, and feature mix.

If your index name ends in __dense (e.g. tdr__dense, fluke__dense) it is already on a vector-enabled plan and the AI controls in this panel are live. If your index name does not end in __dense and is not on the legacy whitelisted vector cores, the Semantic ↔ Lexical Balance, Search Mode and Vector Candidate Pool controls will not appear in the UI — the index is lexical-only.

To enable AI / vector search on an existing index, or to create a new __dense index for it, contact support@opensolr.com for a quote. See also the Hybrid Search architecture overview and the platform docs page on AI & Vector Search.

Defaults at a glance

Control	Default	Range	Vector only?
Field weight: Title	0.10	0.01 – 1.0	no
Field weight: Description	0.05	0.01 – 1.0	no
Field weight: URI	0.01	0.01 – 1.0	no
Field weight: Text	0.01	0.01 – 1.0	no
Field weight: LD Text (`text_t`)	0.01	0.01 – 1.0	no
Semantic ↔ Lexical Balance	0.30	0.10 – 1.00 (10% – 100%)	yes
Minimum Match	Flexible	Flexible / Balanced / Strict / Custom	no
Search Mode	Union	Union / Keywords Required / Meaning Required / Intersection / Lexical Only	yes
Vector Candidate Pool (topK)	200	10 – 1000	yes
Content Quality Boost	0 (off)	0 – 1.0	no
Minimum Relevance Score	0 (off)	0 – 1.0	no
Results Per Page	10	10 – 200	no

Defaults are read from addons/default/modules/solr_manager/config/search_defaults.php on the Opensolr platform. A NULL value in your index configuration means "use the system default" — clicking the per-control Reset button restores that NULL.

1. Field Weights

Five inputs in a single row: Title, Description, URI, Text, LD Text. Each value is the multiplier applied to that field in the underlying Solr edismax qf / pf parameter when the lexical side of the query runs.

Why the defaults are so small (0.01–0.1): they are deliberately kept low so the lexical score stays in the same magnitude as the KNN cosine similarity score (0–1) when the two are combined inside {!bool}. If you push them much higher, lexical wins everything and semantic relevance becomes meaningless. If you push them lower, semantic wins everything and exact-keyword matches lose. Stick within 0.01–1.0 and tune relative to each other rather than absolutely.

The fifth field (LD Text): the text_t Solr field is the cleaner, structured-text version of text, populated from JSON-LD that the crawler extracted at index time. It usually has less boilerplate / nav / footer noise than text, so giving it a slightly higher weight (e.g. 0.03) often improves ranking on content-heavy sites.

Phrase boosting (Solr pf, pf2, pf3) is automatic on top of qf at 0.8x / 0.4x / 0.2x of the same field weights respectively. This rewards results where the query terms appear close together as a phrase or bigram/trigram. There is no UI for it because the multipliers track the field weights you set.
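
To illustrate how the phrase-boost multipliers track the field weights, the following Python sketch derives qf, pf, pf2 and pf3 strings from a single set of weights. It is only an illustration: the Solr field names title, description and uri are assumptions (this page names only text and text_t), the function names are hypothetical, and the platform's actual query builder may differ.

```python
# Hypothetical sketch: derive edismax qf/pf/pf2/pf3 from the panel's field weights.
# Field names other than "text" and "text_t" are assumed for illustration.
DEFAULT_WEIGHTS = {
    "title": 0.10,
    "description": 0.05,
    "uri": 0.01,
    "text": 0.01,
    "text_t": 0.01,
}

# Phrase-boost multipliers relative to the qf weights, as described above.
PHRASE_MULTIPLIERS = {"pf": 0.8, "pf2": 0.4, "pf3": 0.2}


def boost_string(weights, multiplier=1.0):
    """Format weights as a Solr boost list, e.g. 'title^0.1 text^0.01'."""
    return " ".join(
        f"{field}^{round(weight * multiplier, 4):g}" for field, weight in weights.items()
    )


def edismax_params(weights):
    """Return qf plus the three phrase-boost parameters scaled from the same weights."""
    params = {"defType": "edismax", "qf": boost_string(weights)}
    for name, multiplier in PHRASE_MULTIPLIERS.items():
        params[name] = boost_string(weights, multiplier)
    return params


if __name__ == "__main__":
    for key, value in edismax_params(DEFAULT_WEIGHTS).items():
        print(f"{key}={value}")
```

Because every phrase parameter is computed from the same dictionary, raising the Title weight raises its pf, pf2 and pf3 boosts in proportion, which is why the panel needs no separate phrase-boost controls.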

2. Semantic ↔ Lexical Balance (vector indexes only)

A slider 10% ↔ 100% mapping to a lexical_weight value of 0.10 to 1.00 (default 0.30). It controls how much the lexical (BM25) score contributes to the final ranking versus the vector (KNN) score.

Internally we apply a non-linear reshape: pow(BM25 + 1, lexical_weight) - 1. That means:

1.00 — full BM25, lexical dominates. Best for product/SKU/identifier search.
0.50 — sqrt-like compression of BM25, gentle blend.
0.30 — default, vector takes the lead, lexical plays a supporting role. Good general-purpose value for content sites.
0.10 — lexical heavily compressed, vector dominates. Best for natural-language / question-style queries.

Named-entity matches (people, places, brands) keep their dominance at higher slider values; conceptual / ambiguous queries benefit from lower values.
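
The effect of the reshape is easier to see with numbers. The Python sketch below applies the formula quoted above to a few illustrative BM25 scores; the sample scores are invented for demonstration and the function name is hypothetical.

```python
def reshape_lexical(bm25, lexical_weight):
    """Apply the reshape quoted above: pow(BM25 + 1, lexical_weight) - 1."""
    if not 0.10 <= lexical_weight <= 1.00:
        raise ValueError("lexical_weight must be within the slider range 0.10-1.00")
    return (bm25 + 1.0) ** lexical_weight - 1.0


if __name__ == "__main__":
    # Illustrative raw BM25 scores; real values depend on your index and field weights.
    for bm25 in (0.5, 2.0, 8.0):
        row = ", ".join(
            f"w={w:.2f}: {reshape_lexical(bm25, w):.3f}" for w in (1.00, 0.50, 0.30, 0.10)
        )
        print(f"BM25={bm25}: {row}")
```

At a weight of 1.00 the score passes through unchanged, while at 0.50 a raw score of 8 compresses to 2. Lower weights flatten large lexical scores further, so they sit closer to the 0 to 1 range of the cosine similarity they are combined with.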

3. Search Mode (vector indexes only)

Five radio buttons that decide how the lexical query and the vector KNN query are combined inside the {!bool} query. Default is Union.

Mode	{!bool} structure	When to use it
Union (default)	should + should	Broadest. Either signal surfaces a doc. Best for general site search.
Keywords Required	must + should	Doc must match keywords; semantic adds ranking boost. Best for product / part-number / brand search.
Meaning Required	should + must	Doc must be semantically relevant; keywords add ranking boost. Best for natural-language / Q&A queries.
Intersection	must + must	Both signals required. Most precise, most restrictive. Use when you want very tight relevance.
Lexical Only	no vector	Skips the embedding API entirely. Pure BM25. Use when you want deterministic keyword behaviour, the embedding service is slow, or you want a fallback that doesn't depend on vector availability. Automatically zeros the Minimum Relevance Score.
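
As a rough sketch of the table above, the Python snippet below maps each mode to a query string for Solr's bool query parser. The mode identifiers and the parameter names lexq and vecq are assumptions made for illustration; the page does not show the exact query the platform generates, and its real lexical clause is wrapped differently.

```python
# Clause layout per mode: (lexical clause type, vector clause type).
# Mode keys are hypothetical identifiers; the page only gives the UI labels.
BOOL_MODES = {
    "union": ("should", "should"),
    "keywords_required": ("must", "should"),
    "meaning_required": ("should", "must"),
    "intersection": ("must", "must"),
}


def build_q(mode):
    """Return the main q string, referencing the sub-queries by parameter name."""
    if mode == "lexical_only":
        # No vector clause and no {!bool} wrapper: just the lexical query.
        return "{!edismax v=$lexq}"
    lexical_clause, vector_clause = BOOL_MODES[mode]
    return f"{{!bool {lexical_clause}=$lexq {vector_clause}=$vecq}}"


if __name__ == "__main__":
    for mode in (*BOOL_MODES, "lexical_only"):
        print(f"{mode:18s} q={build_q(mode)}")
```

In this sketch the request would also carry lexq (the edismax query) and vecq (the {!knn} clause) as separate parameters, which Solr dereferences through the $name syntax.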

4. Minimum Match (mm)

Controls how many of the user's typed query terms must appear in a document for it to qualify as a match. Default preset: Flexible. Three presets plus a Custom option for raw Solr mm syntax.

Preset	mm value	Behaviour
Flexible (default)	2<65% 4<50% 8<40%	Lenient. Queries of 1–2 terms must match every term; 3–4 terms need ~65%; 5–8 terms need ~50%; 9+ terms need ~40% (percentages round down). Best for broad, conversational searches.
Balanced	2<90% 5<75% 8<60% 12<50%	Proven middle ground. Short queries need ~90%, medium ~75%, long ~60%, very long ~50%.
Strict	2<95% 5<90% 8<80%	Precise. Almost all terms must match. Short queries ~95%, long queries still ~80%.
Custom	your own	Any valid Solr `mm` value: positive integer (`3`), negative integer (`-2`), percentage (`75%`), negative percentage (`-25%`), or tiered (`2<90% 5<75%`).

Tiered syntax: N<M% means that when the query has more than N terms, M% of them must match. Tiers chain: 2<90% 5<75% means 1–2 term queries require all terms to match, 3–5 term queries need 90%, and queries of 6 or more terms need 75%.
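
To check what a given mm value does for a particular query length, the Python sketch below evaluates an mm spec following Solr's documented rules (percentages are rounded down, negative values count terms that may be missing). It is a simplified re-implementation for experimenting, not Solr's own code, and it assumes the spec has no spaces around the < sign.

```python
def min_should_match(num_terms, spec):
    """Return how many optional query terms must match for a query with num_terms terms."""
    result = num_terms
    spec = spec.strip()
    if "<" in spec:
        # Tiered spec: walk the tiers in order until the query fits under a bound.
        for tier in spec.split():
            bound, rule = tier.split("<", 1)
            if num_terms <= int(bound):
                return result
            result = min_should_match(num_terms, rule)
        return result
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Truncate toward zero, so 65% of 3 terms is 1, not 2.
        calc = abs(num_terms * percent) // 100
        result = num_terms - calc if percent < 0 else calc
    else:
        value = int(spec)
        result = num_terms + value if value < 0 else value
    return max(result, 0)


if __name__ == "__main__":
    flexible = "2<65% 4<50% 8<40%"
    for n in (1, 2, 3, 5, 10):
        print(f"{n} terms -> {min_should_match(n, flexible)} must match")
```

Running it against the Flexible preset shows the effect of rounding down: a three-term query only needs one matching term, which is what makes that preset lenient.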

5. Vector Candidate Pool (topK) (vector indexes only)

Slider 10 – 1000 (default 200). Maps directly to the KNN topK parameter: {!knn f=embeddings topK=200}[query_vector]. It is the size of the candidate pool the vector side considers before the {!bool} combiner ranks across both sides.

Higher = broader semantic recall (more docs considered) but slower per-query latency and more facet-count noise; lower = faster and tighter but you may lose semantically relevant docs that didn't make the topK cut. 200 works for most indexes up to ~1M docs. Consider 300–500 for very large indexes where you observe missing semantic matches; consider 50–100 if your index is small and you want maximum speed.
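
For reference, a minimal Python sketch of how such a KNN clause could be assembled from a query vector and a topK value. The function name is hypothetical and the three-element vector is only a placeholder; a real query vector for these indexes has 1024 dimensions.

```python
def knn_clause(query_vector, top_k=200, field="embeddings"):
    """Format a {!knn} clause like the one shown above."""
    if not 10 <= top_k <= 1000:
        raise ValueError("topK must be within the slider range 10-1000")
    # Solr expects the vector as a bracketed, comma-separated list of floats.
    vector = ", ".join(f"{x:.6g}" for x in query_vector)
    return f"{{!knn f={field} topK={top_k}}}[{vector}]"


if __name__ == "__main__":
    # Placeholder vector for readability only.
    print(knn_clause([0.1, -0.2, 0.3], top_k=50))
```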

6. Content Quality Boost

Slider 0 – 1.0 (default 0 / off). When > 0, an additive boost is applied via bf=linear(quality_f, weight, 0) on the edismax side. The quality_f Solr field is computed at indexing time based on title length, description length, body text length, and presence of an og:image.

Effect: rich pages (long descriptions, body text, images) are pushed up; thin/stub pages (one-line titles, no body) drop. Useful when your index mixes substantial articles with low-effort pages and you want depth to win.

Try 0.3–0.5 for sites mixing detailed articles with stubs. Leave at 0 if your content is reasonably uniform — the boost just becomes noise.
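
Solr's linear(x, m, c) function evaluates to m * x + c, so with c = 0 the boost added to a document is simply the slider weight times its quality_f value. The Python sketch below shows the arithmetic; it assumes quality_f lies between 0 and 1, which this page does not state, and the function names are hypothetical.

```python
def quality_boost_param(weight):
    """Return the bf value for a slider weight, or None when the boost is off."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be within the slider range 0-1.0")
    if weight == 0:
        return None
    return f"linear(quality_f,{weight:g},0)"


def additive_boost(quality, weight):
    """Value of linear(quality_f, weight, 0) for one document: weight * quality + 0."""
    return weight * quality


if __name__ == "__main__":
    print(quality_boost_param(0.3))
    # Assumed quality scores: a rich article (0.9) versus a thin stub (0.1).
    for label, quality in (("rich article", 0.9), ("thin stub", 0.1)):
        print(f"{label}: +{additive_boost(quality, 0.3):.2f}")
```

Under these assumptions a weight of 0.3 adds about 0.27 to a rich page and 0.03 to a stub. That gap is comparable in size to the compressed lexical scores, so even moderate weights can reorder results.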

7. Minimum Relevance Score

Slider 0 – 1.0 (default 0 / off). Applied as a Solr frange post-filter on the combined {!bool}(lexical+vector) score: fq={!frange l=X cache=false cost=200}query($q). Drops results whose final score falls below the threshold.

Why it exists: the lexical side of the query is wrapped in {!func} which means it scores every doc (zero for non-lexical hits), and KNN topK always returns N docs even when cosine similarity is near-zero. Without this filter, numFound inflates to the size of your base filter set and facet counts get polluted by noise docs.

0 — off, all candidates returned. numFound may be huge. Default for backwards compatibility.
0.1 – 0.3 — recommended sweet spot. Kills pure noise (sqrt(BM25+1)-1 easily clears 0.2; cosine >= 0.2 also clears it).
0.4+ — increasingly strict. Use when results feel off-topic and you want only highly relevant docs.

Auto-zeroed when Lexical Only mode is selected — the threshold is calibrated for the combined hybrid score and would filter all results in pure keyword mode.
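
A small Python sketch of the filter logic described in this section, including the auto-zero rule for Lexical Only. The function name and the mode identifier lexical_only are hypothetical; only the fq string itself is taken from this page.

```python
def relevance_filter(min_score, search_mode):
    """Return the frange fq for the threshold, or None when no filter should be sent."""
    if search_mode == "lexical_only" or min_score <= 0:
        # Off by default, and auto-zeroed in Lexical Only mode.
        return None
    # cache=false with cost >= 100 makes Solr run the frange as a post-filter.
    return f"{{!frange l={min_score:g} cache=false cost=200}}query($q)"


if __name__ == "__main__":
    print(relevance_filter(0.2, "union"))
    print(relevance_filter(0.2, "lexical_only"))
```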

8. Results Per Page

Slider 10 – 200 (default 10). Number of results the search UI returns per page by default. Applies to both the hosted Search UI at search.opensolr.com/INDEX and to API responses when no explicit rows param is sent.

Higher values give you more results per request but increase response size and bandwidth. 10 is a UI-friendly default. 20–30 works well for grid layouts. 50+ for list-style result pages or when you do client-side filtering across the whole result set.

Quick recipes

Site type	Recommended settings
News / blog (surface fresh content fast)	Min Match: Flexible. Search Mode: Union. Lexical balance ~ 0.3. Quality Boost 0.3 (penalises thin posts). Min Relevance Score 0.2 (cuts noise from broad queries).
Knowledge base / docs (natural-language Q&A)	Min Match: Flexible. Search Mode: Meaning Required. Lexical balance 0.2. Bump Text and LD Text weights to 0.05–0.10. Quality Boost 0.3. Min Relevance Score 0.25.
E-commerce / product catalog (exact matches matter)	Min Match: Strict. Search Mode: Keywords Required. Lexical balance 0.6–0.8. Title weight 0.2–0.3 (product names live in titles). Min Relevance Score 0.3.
Internal site search / navigation (find the page they typed)	Min Match: Balanced. Search Mode: Lexical Only (no embedding round-trip; faster). Lexical balance 0.7. URI weight 0.05 (URL slugs help on nav-style queries).
Mixed-content portal (the system defaults)	Don't change anything. The shipped defaults (Flexible / Union / 0.30 lexical / topK 200) are tuned for general-purpose hybrid search.

How it works under the hood

You move a slider. JS calls POST /admin/solr_manager/save_search_tuning/{INDEX} with the field name and the new value. CSRF-protected. Debounced ~400 ms.
The value is stored on your index row in default_solr_cores.search_* (e.g. search_lexical_weight, search_mm, search_bool_mode). NULL means “use the system default from search_defaults.php”.
On the next search request, Hybrid_search.php reads your row, merges it on top of search_defaults.php, and constructs the final Solr query. No restart, no re-index.
Reset sends a special null sentinel that nulls the column, dropping the override entirely.
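
The merge step can be pictured with a short Python sketch: any column left NULL (None here) falls back to the system default, and everything else overrides it. This is an illustration only. The platform code is PHP and is not shown on this page, the column name search_topk is hypothetical, and storing the mm string and the mode as plain values is an assumption.

```python
# Assumed subset of system defaults, using values from the defaults table on this page.
SYSTEM_DEFAULTS = {
    "search_lexical_weight": 0.30,
    "search_mm": "2<65% 4<50% 8<40%",
    "search_bool_mode": "union",
    "search_topk": 200,  # hypothetical column name
}


def effective_settings(index_row, defaults=SYSTEM_DEFAULTS):
    """Overlay non-NULL per-index overrides on top of the system defaults."""
    merged = dict(defaults)
    for key, value in index_row.items():
        if key in defaults and value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    # An index that overrides only the lexical weight; search_mm was reset to NULL.
    row = {"search_lexical_weight": 0.6, "search_mm": None}
    print(effective_settings(row))
```

Because a reset simply stores NULL, an index with no overrides automatically picks up any future change to the system defaults.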

Reset behaviour

Every control has a small Reset button next to it that restores that one setting to the system default (it nulls the corresponding DB column on your index). At the bottom of the panel a Reset All to Defaults button does the same for every control at once.

CMS module / plugin equivalents

The same controls exist in the official Drupal module and WordPress plugin, but they are stored at the CMS layer and have slightly different shipped defaults so the modules feel right out of the box for those CMSes:

Drupal Opensolr Search module — Configuration → Search and metadata → Opensolr Search → Search Tuning tab. Documented at opensolr.com/opensolr-drupal-search-docs/search-tuning.
WordPress Opensolr Search plugin — Settings → Opensolr Search → Search Tuning tab. Documented at opensolr.com/wordpress-search-docs/search-tuning.

If you operate the index directly via the Opensolr admin (not through a CMS module), this page (FAQ #202) is the canonical reference and the values you see in the admin UI match the table above.

Frequently asked

Do I need to re-index after changing tuning settings?
No. Tuning is read at query time. The next search uses your new values.

Why is the Semantic ↔ Lexical Balance slider missing on my index?
It only appears for vector-enabled indexes — those with a dense-vector field in the schema (typically the __dense suffix or one of the legacy whitelisted vector cores). Lexical-only indexes don't have a vector side to balance against.

What's the difference between Content Quality Boost and Minimum Relevance Score?
Quality Boost rewards rich docs (additive, in the score). Minimum Relevance Score filters out low-score docs (post-filter, before they reach the response). They're independent and you can use both together.

If I switch to Lexical Only, what happens to the vector / topK / balance settings?
They become inert. The query stops calling the embedding API entirely — no {!knn} clause is generated, no {!bool} combiner runs, just plain edismax with your field weights and mm. Min Relevance Score is auto-zeroed because its threshold is calibrated for the combined hybrid score.

Can I version / export / restore my tuning settings?
The values live on the default_solr_cores row for your index. You can read them via the Get Index API and re-apply them via the same admin save endpoint. There is no built-in version history yet.

The Freshness Window control I used to see is gone — where did it go?
Freshness Window and Default Search Mode controls were removed from the UI in April 2026 because users now set their own fresh= filter and date facets directly in the search UI. The backend settings are still readable for backwards compatibility but no longer surfaced in the admin panel.