Querying the Solr API: Search Parameters Explained
What is the Solr /select API?
Apache Solr exposes a powerful search API at the /select endpoint. This is how you query your index — you send an HTTP request with search parameters, and Solr returns matching documents as JSON.
Your Opensolr Index has its own URL (found in your Index Control Panel), and you query it like this:
https://YOUR_INDEX_HOST/solr/YOUR_INDEX_NAME/select?q=your+search+terms&wt=json
Your index is protected by HTTP Basic Authentication over HTTPS. The default username is opensolr and the password is your Secret API Key (found in your Control Panel Dashboard). You can change these credentials in the Security section.
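As a rough illustration of the URL and authentication scheme described above, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The helper name `build_select_request` and the `YOUR_SECRET_API_KEY` placeholder are made up for this example; substitute the host, index name, and credentials from your own Control Panel.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_select_request(host, index, params, username, password):
    """Return an unsent urllib Request for the index's /select endpoint."""
    url = f"https://{host}/solr/{index}/select?{urlencode(params)}"
    # HTTP Basic Authentication: base64("username:password") in the header.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder values only; none of these are real credentials.
request = build_select_request(
    "YOUR_INDEX_HOST",
    "YOUR_INDEX_NAME",
    {"q": "your search terms", "wt": "json"},
    "opensolr",
    "YOUR_SECRET_API_KEY",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would then return the parsed response, assuming the host and credentials are valid.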
🔍 The Solr Query Inspector: Your Cheat Sheet
Here is the best part: you do not need to figure out all these parameters from scratch.
The Opensolr Search UI has a built-in Solr Query Inspector that shows you the exact parameters used for every search. Here is how to use it:
- Go to your search UI at https://search.opensolr.com/YOUR_INDEX_NAME
- Type a search query and hit enter
- Look for the magnifying glass icon in the bottom-right corner of the page
- Click it to open the Solr Query Inspector modal
- You will see a table listing every single parameter sent to Solr, with their values
- Click the "Copy Params" button to copy all parameters to your clipboard
You can then paste these parameters directly into your application. This is the fastest way to get a working hybrid search — just copy what the built-in UI does and adapt it.
Essential Parameters
Here are the most important parameters for querying your Web Crawler index:
Basic Search
| Parameter | Description | Example |
|---|---|---|
| q | The search query | q=opensolr documentation |
| df | Default search field (if q does not specify a field) | df=title |
| rows | Number of results to return | rows=10 |
| start | Offset for pagination (0-based) | start=20 (page 3 with rows=10) |
| wt | Response format | wt=json |
| fl | Fields to return (comma-separated) | fl=id,uri,title,description,og_image,score |
| sort | Sort order | sort=creation_date desc or sort=score desc |
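Because start is a 0-based offset rather than a page number, applications usually convert between the two. A minimal sketch of that arithmetic, where the helper name `page_params` is hypothetical:

```python
def page_params(page, rows=10):
    """Translate a 1-based page number into Solr's start/rows parameters."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    # Page 1 starts at offset 0, page 2 at offset `rows`, and so on.
    return {"start": (page - 1) * rows, "rows": rows}


print(page_params(3))
```

With rows=10, page 3 maps to start=20, matching the example in the table.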
Filter Queries (fq)
Filter queries narrow results without affecting the relevancy score. They are also cached by Solr, making repeated filtering very fast.
fq=content_type:text*              // Only HTML pages
fq=meta_domain:yoursite.com        // Only from a specific domain
fq=meta_detected_language:en       // Only English pages
fq=creation_date:[NOW-7DAY TO *]   // Only from the last 7 days
fq=-uri_s:*/admin*                 // Exclude admin pages
You can use multiple fq parameters — they are AND-ed together.
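Sending several fq parameters means repeating the same key in the query string, which a plain dictionary cannot express. One way to do this in Python is to pass a list of (key, value) pairs to `urlencode`, which keeps the repeats and takes care of URL-encoding characters such as brackets and spaces. The filter values below are taken from the examples above.

```python
from urllib.parse import urlencode

# A list of pairs (not a dict) so that "fq" can appear more than once.
params = [
    ("q", "opensolr documentation"),
    ("fq", "meta_detected_language:en"),
    ("fq", "creation_date:[NOW-7DAY TO *]"),
    ("fq", "-uri_s:*/admin*"),
    ("wt", "json"),
]

query_string = urlencode(params)
print(query_string)
```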
🧠 Hybrid Search: Combining Keywords and AI
This is where Opensolr really shines. The hybrid search combines traditional keyword matching (lexical search) with AI-powered semantic search (vector similarity) in a single query.
The formula looks like this:
q={!func}sum(
product(VECTOR_WEIGHT, query($vectorQuery)),
product(LEXICAL_WEIGHT, div(log(sum(1, query($lexicalQuery))), sum(log(sum(1, query($lexicalQuery))), 20)))
)
Do not panic — here is what it means in plain English:
- vectorQuery: Searches the vector embeddings field using KNN (K-Nearest Neighbors). Finds pages that are semantically similar to your query, even if they do not share the same exact words.
- lexicalQuery: Traditional keyword search using Solr's eDisMax parser. Finds pages that contain your exact search terms, with intelligent weighting.
- The formula combines both scores: vector similarity gets direct weight, while the lexical score is dampened with a logarithmic function to prevent keyword-stuffed pages from dominating.
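To see how the dampening behaves, the formula can be reproduced outside Solr. The sketch below mirrors the function query term by term; it assumes that Solr's log() is base 10 and that a sub-query contributes a score of 0 for documents it does not match. The weights are left as arguments because the formula only gives them as placeholders, and the values passed in the example are purely illustrative.

```python
import math


def hybrid_score(vector_score, lexical_score, vector_weight, lexical_weight):
    """Recompute the hybrid function query for one document."""
    # log(sum(1, query($lexicalQuery))), with Solr's log() taken as base 10.
    damped = math.log10(1 + lexical_score)
    # div(damped, sum(damped, 20)): always below 1, flattening large scores.
    lexical_part = damped / (damped + 20)
    return vector_weight * vector_score + lexical_weight * lexical_part


# Illustrative inputs: a raw lexical score of 99 contributes only 2/22.
print(hybrid_score(0.8, 99.0, 1.0, 1.0))
```

Under these assumptions the lexical term can never reach 1, however high the raw keyword score, which is what keeps keyword-heavy pages from overwhelming the vector similarity.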
The Sub-Queries
Vector Query:
vectorQuery={!knn f=embeddings topK=250}[embedding_vector_here]
This tells Solr to find the 250 nearest documents in the vector space. The embedding vector is generated from your search query text using the same AI model that was used to embed the indexed pages.
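If you build this parameter yourself, the vector is written as a bracketed, comma-separated list of floats after the local-params prefix. A small formatting sketch follows; the helper name `knn_query` is hypothetical, and producing the embedding itself (with the same model used at index time) is outside its scope.

```python
def knn_query(vector, field="embeddings", top_k=250):
    """Format a list of floats as a {!knn} query string."""
    values = ", ".join(repr(float(v)) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{values}]"


# A three-element vector for illustration; real embeddings are much longer.
print(knn_query([0.1, -0.25, 3]))
```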
Lexical Query:
lexicalQuery={!edismax
qf="title^5 description^4 uri^0.5 text^0.01"
pf="title^10 description^8 uri^1 text^0.02"
pf2="title^5 description^4 uri^0.5 text^0.01"
pf3="title^2.5 description^2 uri^0.25 text^0.005"
ps=0 ps2=1 ps3=2
mm="2<-1 5<-2"
}your search terms here
This is the keyword search component. The field weights are:
- title (5x): Most important. A title match is boosted by 5, compared with 0.01 for a body text match.
- description (4x): Second most important.
- uri (0.5x): URL matches get a small boost.
- text (0.01x): Body text matches get minimal direct weight (but phrase matches in pf boost them).
The pf / pf2 / pf3 parameters boost exact phrases, bigrams, and trigrams respectively, which means results that match your query as a complete phrase rank higher than results that just happen to contain the same words scattered around.
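mm (minimum match) controls how many of your search terms must appear. Each condition applies only when the query has more terms than the number before the < sign: with 1 or 2 terms, all must match; with 3 to 5 terms, all but one must match; with 6 or more terms, all but two must match.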
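The conditional mm syntax is easy to misread, so here is a small sketch that evaluates it. It follows the rule from the Solr documentation that a condition `n<value` applies only when the query has more than n terms, and that at or below the lowest n every term is required. The function name is hypothetical, and only integer values (not percentages) are handled.

```python
def min_should_match(term_count, spec="2<-1 5<-2"):
    """Return how many terms must match for a conditional mm spec."""
    required = term_count  # default: every term is required
    best_threshold = -1
    for clause in spec.split():
        threshold_text, value_text = clause.split("<")
        threshold, value = int(threshold_text), int(value_text)
        # The condition with the highest threshold below term_count wins.
        if term_count > threshold and threshold > best_threshold:
            best_threshold = threshold
            # Negative values mean "all but N"; positive values are absolute.
            required = term_count + value if value < 0 else value
    return required


for n in (1, 2, 3, 5, 6, 8):
    print(n, min_should_match(n))
```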
Highlighting
To get search term highlighting in your results (bold matching words in snippets):
hl=true
hl.fl=uri,title,description,text
hl.method=unified
hl.fragsize=200
hl.snippets=1
hl.tag.pre=<em>
hl.tag.post=</em>
The highlighted text is returned in a separate highlighting section of the JSON response, keyed by document id.
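Since the snippets live apart from the documents, client code typically joins the two by id. The sketch below shows one way to do that; the sample response is hand-written for illustration (it is not real output from an index), and the helper name `attach_snippets` is hypothetical.

```python
def attach_snippets(response):
    """Copy the first highlighted fragment of each field onto its document."""
    highlighting = response.get("highlighting", {})
    results = []
    for doc in response["response"]["docs"]:
        fields = highlighting.get(doc["id"], {})
        # With hl.snippets=1 each field holds at most one fragment.
        snippets = {name: frags[0] for name, frags in fields.items() if frags}
        results.append({**doc, "snippets": snippets})
    return results


# Hand-written sample shaped like a Solr JSON response.
sample_response = {
    "response": {"docs": [{"id": "doc-1", "title": "Solr basics"},
                          {"id": "doc-2", "title": "Other page"}]},
    "highlighting": {"doc-1": {"title": ["<em>Solr</em> basics"], "text": []}},
}

merged = attach_snippets(sample_response)
print(merged[0]["snippets"])
```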
Spellcheck
To get "Did you mean...?" suggestions:
spellcheck=true
spellcheck.q=your search terms
spellcheck.count=5
spellcheck.collate=true
spellcheck.maxCollationTries=15
spellcheck.maxCollations=3
The suggestions appear in the spellcheck section of the response.
Faceting
To get counts of results grouped by field values (e.g., how many results per language):
facet=true
facet.field=meta_detected_language
facet.field=currency_s
facet.mincount=1
facet.sort=index
This returns a facet_counts section with value-count pairs.
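In the JSON response, each faceted field is normally a flat list that alternates values and counts. This is Solr's default layout; the sketch assumes the json.nl parameter has not been changed. The following pairs them up, using a hand-written sample response and a hypothetical helper name.

```python
def facet_pairs(response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hand-written sample shaped like the facet_counts section.
sample_response = {
    "facet_counts": {
        "facet_fields": {"meta_detected_language": ["de", 12, "en", 340, "fr", 7]}
    }
}

print(facet_pairs(sample_response, "meta_detected_language"))
```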
🚫 Excluding Pages from Results
You can exclude certain pages or URL patterns from your search results using filter queries. This is useful for hiding taxonomy pages, tag pages, admin pages, or other non-content URLs.
In the Opensolr Control Panel, you can configure these exclusions in the search.xml configuration file. They are added as fq (filter query) parameters in the appends section, which means they are automatically applied to every search query:
<lst name="appends">
  <str name="fq">-uri_s:*/taxonomy*</str>
  <str name="fq">-uri_s:*/admin*</str>
  <str name="fq">-uri_s:*/tag/*</str>
  <str name="fq">-uri_s:*/store-locator*</str>
</lst>
The - prefix means "exclude". The uri_s field is the exact string version of the URL, and * is a wildcard. So -uri_s:*/taxonomy* means "exclude any URL containing /taxonomy".
You can also exclude by other fields:
<str name="fq">-og_image:"https://example.com/default-placeholder.jpg"</str>
This is a powerful way to clean up your search results without re-crawling.
Tip: You can also apply these exclusions dynamically in your application by adding fq parameters to your API queries, without modifying the search.xml config.
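A minimal sketch of that dynamic approach: it turns a list of URL fragments into the same -uri_s wildcard filters shown in the config example. The helper name `exclusion_filters` is hypothetical, and the sketch assumes the fragments contain no spaces or other characters that would need escaping in a Solr query.

```python
from urllib.parse import urlencode


def exclusion_filters(url_fragments):
    """Build one negative wildcard fq per URL fragment."""
    return [("fq", f"-uri_s:*{fragment}*") for fragment in url_fragments]


filters = exclusion_filters(["/taxonomy", "/admin", "/tag/"])
# Repeated fq keys are preserved because the params are a list of pairs.
query_string = urlencode([("q", "your search terms")] + filters + [("wt", "json")])
print(filters)
```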