The OpenSolr Search Drupal Module
AI-powered search for Drupal 10 & 11 — zero configuration, automatic facets, hybrid vector search. Replace any Drupal search solution in under 30 minutes.
Why Switch to OpenSolr Search?
Whether you are using Drupal's built-in core search, Search API Solr, Elasticsearch, or any other search solution, there are compelling reasons to move to the OpenSolr Search Drupal Module:
The Solr 10.x Problem
Solr 10 introduces a hard limit of 1,000 schema fields. This is a breaking change for any Drupal site that relies on Search API Solr, because the module creates separate Solr fields for every Drupal field, in every language, in every index. A multilingual site with 3 languages, 50 content fields, and 2 indexes can easily exceed 1,000 schema fields — making Search API Solr fundamentally incompatible with Solr 10.x on multilingual sites.
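The arithmetic is easy to sketch. A minimal back-of-the-envelope model (the per-field variant count is an assumption; the exact multiplier depends on your Search API Solr configuration):

```python
# Back-of-the-envelope model of Search API Solr schema growth on a
# multilingual site. The variants_per_field default (extra copies for
# sorting, spellcheck, etc.) is an illustrative assumption.
def estimated_schema_fields(content_fields, languages, indexes,
                            variants_per_field=4):
    # One Solr field per Drupal field, per language, per index,
    # times however many variant copies the module creates.
    return content_fields * languages * indexes * variants_per_field

total = estimated_schema_fields(content_fields=50, languages=3, indexes=2)
print(total)         # 1200
print(total > 1000)  # True: over the Solr 10 hard limit
```

Even with conservative assumptions, a midsize multilingual site blows past the limit.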
The OpenSolr Search module is not affected by this limit. It uses the web crawler approach — content is indexed as flat text and metadata extracted from HTML, not mapped field-by-field to the Solr schema.
The Boolean Clause Problem
Search API Solr generates excessively complex queries that routinely hit the maxBooleanClauses safety limit (default 1,024). A simple 5-word search can expand into thousands of boolean clauses because the module queries every mapped field with boosting, synonyms, and phrase variations. Solr 10.x is removing the ability to override this limit entirely.
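The expansion is multiplicative, which is why small queries trip the limit. A rough illustrative model (field and variant counts are assumptions, not measurements of the module):

```python
# Rough model of query expansion: every search word is tried against
# every mapped field, each match carrying boost/synonym/phrase variants.
# The mapped_fields and variants_per_field values are illustrative.
def boolean_clauses(words, mapped_fields, variants_per_field=3):
    return len(words) * mapped_fields * variants_per_field

clauses = boolean_clauses(["cheap", "red", "trail", "running", "shoes"],
                          mapped_fields=120)
print(clauses)         # 1800
print(clauses > 1024)  # True: trips the default maxBooleanClauses
```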
Complexity vs. Simplicity
Search API Solr requires you to manually map every Drupal field to Solr, maintain complex schema configurations, install separate modules for facets, autocomplete, and highlighting, and troubleshoot silent failures when configs drift. The OpenSolr Search module replaces all of this with a single module that works out of the box.
Since this is a Drupal module, this guide focuses primarily on users migrating from Search API Solr — the most common Drupal search backend — but the installation and configuration steps apply regardless of what you are replacing.
How It Works
The OpenSolr Search module offers two equally powerful indexing methods. You can use one or both simultaneously:
Method 1: Web Crawler
The OpenSolr Web Crawler visits your site over HTTPS, reads every published page, and indexes all content automatically — exactly like Google does. The module injects structured <meta> tags and JSON-LD data on your pages so the crawler can extract rich, facetable fields.
What the crawler discovers automatically on every page:
- All Open Graph meta tags (`og:title`, `og:description`, `og:image`, `og:type`, `og:locale`)
- All Twitter Card meta tags
- All JSON-LD structured data — every schema.org type: Article, Product, Recipe, Event, Course, JobPosting, FAQPage, HowTo, BreadcrumbList, and hundreds more
- All Microdata attributes (`itemprop`, `itemtype`, `itemscope`)
- Product data: prices, currencies, brands, categories, SKUs (from JSON-LD, Open Graph, and microdata)
- Full page text with structure preserved (headings, paragraphs, lists, tables)
- Attached documents: PDFs, DOCX, XLSX, PPTX, ODT — full text extraction
- Language and locale detection
- Sentiment analysis (positive/negative/neutral scores)
- Vector embeddings for semantic/AI search (BGE-m3, 1024 dimensions)
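The JSON-LD extraction above can be sketched with the standard library. This is an illustrative stand-in for what the crawler does server-side, not the module's actual parser, and the sample markup is invented:

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect schema.org JSON-LD blocks from a page, roughly the way
    a crawler does before mapping properties to index fields.
    (Illustrative sketch, not the module's actual parser.)"""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = ""
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf += data

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads(self._buf))
            self._buf, self._in_jsonld = "", False

page = """<html><head><script type="application/ld+json">
{"@type": "Product", "name": "Trail Shoe",
 "offers": {"price": "89.99", "priceCurrency": "EUR"}}
</script></head><body>Full page text</body></html>"""

extractor = JsonLdExtractor()
extractor.feed(page)
product = extractor.blocks[0]
print(product["name"], product["offers"]["price"])  # Trail Shoe 89.99
```

Once extracted, properties like `offers.price` become typed index fields such as `price_f`.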
Method 2: Data Ingestion API
The Data Ingestion API pushes content directly from Drupal to Solr — no crawler involved. This is fully automated inside the module:
- Real-time sync (one checkbox): When you create, update, unpublish, or delete a node in Drupal, the search index updates within seconds. No cron required — it happens on entity save.
- Bulk ingestion: One-click "Ingest All Now" button queues your entire site for processing. Drupal cron sends documents in configurable batches (500-5,000 per cron run, 50 per API call).
- Field mappings create Solr fields automatically: When you map a Drupal field in the Field Mapping tab (e.g., `field_color` → `color_sm`), the Data Ingestion API sends that field value directly to Solr as a typed, facetable field. No schema editing; the field exists in Solr the moment the first document is ingested.
- Handles 500K+ documents with constant memory per batch: no memory spikes, no timeouts.
- Works for private sites: Intranets, password-protected sites, and sites behind firewalls that a crawler cannot reach.
- Commerce product support: Prices, brands, SKUs, and taxonomy category hierarchies are extracted and pushed automatically.
- Multilingual: Each translation is ingested separately with the correct locale, language tags, and translated field values (including taxonomy term labels).
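The constant-memory claim rests on a standard batching pattern: pull a fixed-size slice from the queue, send it, release it. A minimal sketch (illustrative, not the module's actual code):

```python
from itertools import islice

def batched(iterable, size):
    """Yield successive lists of at most `size` items without loading
    the whole queue into memory, the pattern behind constant-memory
    bulk ingestion. (Illustrative, not the module's actual code.)"""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

# Simulate a queue of 2,500 node IDs sent 50 per API call.
calls = list(batched(range(1, 2501), 50))
print(len(calls))    # 50 API calls
print(calls[0][:3])  # [1, 2, 3]
```

Because each batch is discarded before the next is built, memory use stays flat no matter how large the queue grows.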
Both methods use the same Field Mapping configuration — map a field once, and it works whether content is crawled or ingested. The crawler reads the mapped values from <meta> tags injected on the page; the Data Ingestion API reads them directly from Drupal entity fields and pushes them to Solr.
Installation & Setup
Step 1: Install the Module
```shell
composer require drupal/opensolr_search
drush en opensolr_search
```
The module has zero conflicts with Search API, core search, or any other search module. You can run them side by side during transition.
Step 2: Connect to OpenSolr
Go to Administration > Configuration > Search and metadata > Opensolr Search (/admin/config/search/opensolr).
Enter your OpenSolr account email and API key (from your OpenSolr dashboard), select a server region closest to your users, and click Save & Connect. The module automatically creates and configures a vector-enabled Solr index for you — no schema setup needed.
Step 3: Select Content Types
In the Data Crawler tab, select which Drupal content types and Commerce product types to include in search. Check Include attached files if you want PDFs, DOCX, and XLSX documents to be searchable too.
Step 4: Start Crawling
Click Start Crawl Schedule. The crawler begins indexing your site automatically. You can monitor progress (pages indexed, pages in queue, errors) directly from the Drupal admin.
Crawler settings you can tune:
- Parallel threads (1-20): how many pages to fetch simultaneously
- Request delay (0.1-10s): pause between requests to respect server limits
- Crawl mode: shallow (sitemap only, recommended) or deep (follow all links)
Step 5: Configure Fields, Facets, and Search Tuning
This is where the power is. See the detailed sections below for Field Mapping, Faceted Search, and Search Tuning.
Step 6: Go Live
Your search page is already live at /opensolr-search. The module automatically redirects Drupal's built-in search forms to this page. When you are satisfied:
- Redirect any old search URLs to `/opensolr-search` (via the Drupal Redirect module or `.htaccess`)
- Disable and uninstall your previous search modules
- Done. No data migration needed — the crawler or Data Ingestion API has already indexed everything independently
Faceted Search — The Complete Guide
This is the most powerful feature of the module. You can create faceted navigation from two sources: fields the crawler discovers automatically, and custom Drupal fields you map manually.
Source 1: Auto-Discovered Fields
The crawler automatically extracts and indexes metadata from every page it visits. These fields are immediately available as facets without any configuration:
| Field | Source | Facet Example |
|---|---|---|
| `category_s` | Content type / `og:type` | Article, Page, Blog Post, Product |
| `author_s` | `<meta name="author">` | John Smith, Jane Doe |
| `language_s` | Language detection | English, Spanish, French |
| `og_locale_s` | `<meta property="og:locale">` | en_us, es_es, fr_fr |
| `price_f` | JSON-LD `Product.offers.price` | Price slider ($0 – $999) |
| `currency_s` | JSON-LD `priceCurrency` | EUR, USD, GBP |
| `content_type_s` | HTTP Content-Type | Web pages vs Documents |
| `product_category_s` | JSON-LD `Product.category` | Electronics > Phones |
Plus every JSON-LD property. If your pages carry Recipe schema, the crawler indexes `recipeCategory`, `cookTime`, and `recipeYield`; if they carry Event schema, it indexes `eventDate`, `location`, and `organizer`. Every schema.org type is handled the same way, with no per-type configuration.
Source 2: Custom Drupal Field Mapping
For fields that are not in your HTML metadata (internal Drupal fields like taxonomy terms, custom fields, entity references), the module lets you map any Drupal field to a Solr-facetable field.
Go to the Field Mapping tab in the module admin. For each field mapping you configure:
- Drupal Field: Select from all available node and Commerce product fields
- Meta Tag Name: The name used in the index (e.g., `genre`, `difficulty`, `color`)
- Solr Type: Choose the right type for your data:
| Solr Type | Use For | Facet Widget |
|---|---|---|
| `_s` (String) | Taxonomy terms, categories, tags | Checkbox list |
| `_sm` (String multi-valued) | Fields with multiple values | Checkbox list |
| `_f` (Float) | Prices, ratings, scores | Min/Max slider |
| `_fm` (Float multi-valued) | Multiple numeric values | Min/Max slider |
| `_i` (Integer) | Counts, quantities, years | Min/Max slider |
| `_im` (Integer multi-valued) | Multiple integer values | Min/Max slider |
| `_dt` (Date) | Publication date, event date | Date range picker |
| `_dtm` (Date multi-valued) | Multiple dates | Date range picker |
- Display Label: Human-readable label shown in the search sidebar (e.g., "Genre", "Difficulty Level")
The module uses these mappings in both indexing methods: it injects <meta property="opensolr:field_name_type"> tags on your pages for the crawler to read, AND it pushes the mapped field values directly via the Data Ingestion API. Either way, the fields appear in Solr as typed, facetable fields — no Solr schema configuration needed.
Important for multi-valued fields: If your Drupal field has multiple values (e.g., a Tags taxonomy reference with multiple terms), you must use a multi-valued Solr type (_sm, _fm, _im, _dtm). The module validates this automatically.
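The injected tag format (`opensolr:field_name_type`) and the multi-valued rule can be sketched together. The helper function below is hypothetical; the module renders these tags internally:

```python
import html

def opensolr_meta_tags(name, solr_type, values):
    """Render <meta> tags in the opensolr:<name>_<type> form described
    above. (Hypothetical helper; the module does this internally.)"""
    # Multi-valued Solr types end in 'm': _sm, _fm, _im, _dtm.
    if len(values) > 1 and not solr_type.endswith("m"):
        raise ValueError(f"{name}_{solr_type} is single-valued; "
                         f"use a multi-valued type such as sm")
    return [f'<meta property="opensolr:{name}_{solr_type}" '
            f'content="{html.escape(str(v))}" />'
            for v in values]

print(opensolr_meta_tags("color", "sm", ["Red", "Blue"])[0])
# <meta property="opensolr:color_sm" content="Red" />
```

Mapping a multi-term taxonomy field to a single-valued type fails validation, which is exactly the check the module performs for you.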
Configuring Facets
After the crawler has indexed your site, go to the Field Mapping tab. Below the field mapping section, you will see a Facet Configuration table showing all facetable fields discovered in your index, with document counts.
For each field you can:
- Enable/Disable it as a visible facet in the search sidebar
- Set the widget type:
- List (checkboxes) — for string fields like categories, authors, tags
- Slider (min/max range) — for numeric fields like price, rating
- Date Range (date picker) — for date fields like publication date
- Set the display label shown to users
- Set the display order (weight) — drag to reorder
Concrete Use Cases
Use Case 1: Language Teaching Platform
A Spanish language teaching site with articles, lessons, and resources.
Field Mappings:
| Drupal Field | Meta Tag Name | Solr Type | Facet Label |
|---|---|---|---|
| `field_level` (taxonomy) | `level` | `_s` (string) | Level |
| `field_topic` (taxonomy) | `topic` | `_sm` (string multi) | Topic |
| `field_resource_type` (taxonomy) | `resource_type` | `_s` (string) | Resource Type |
| `field_target_audience` (list) | `audience` | `_s` (string) | Audience |
Result: Users search for "conjugation exercises" and filter by Level (A1, A2, B1, B2), Topic (Grammar, Vocabulary), and Resource Type (Worksheet, Video, Article). All facets update counts in real time as filters are applied.
Use Case 2: E-Commerce Store
A Drupal Commerce store selling products with complex attributes.
Auto-discovered facets (no configuration needed):
- Price slider from JSON-LD Product schema
- Currency filter
- Product categories from JSON-LD breadcrumbs
Custom field mappings:
| Drupal Field | Meta Tag Name | Solr Type | Facet Label |
|---|---|---|---|
| `field_brand` (entity ref) | `brand` | `_s` (string) | Brand |
| `field_color` (taxonomy) | `color` | `_sm` (string multi) | Color |
| `field_size` (list) | `size` | `_sm` (string multi) | Size |
| `field_rating` (decimal) | `rating` | `_f` (float) | Rating |
| `field_in_stock` (boolean) | `stock` | `_s` (string) | Availability |
Result: Users browse products with facets for Brand, Color, Size, Price (slider), and Rating (slider). Commerce product prices, SKUs, and categories are extracted automatically from Drupal Commerce entities and JSON-LD.
Use Case 3: News / Media Site
A content-heavy site with thousands of articles across multiple categories.
Custom field mappings:
| Drupal Field | Meta Tag Name | Solr Type | Facet Label |
|---|---|---|---|
| `field_section` (taxonomy) | `section` | `_s` (string) | Section |
| `field_tags` (taxonomy) | `tags` | `_sm` (string multi) | Tags |
| `field_published_date` (date) | `pub_date` | `_dt` (date) | Published |
Auto-discovered facets:
- Author (from article:author meta tag)
- Language (from og:locale)
- Content type (web pages vs attached PDFs)
Result: Users search articles and filter by Section (Politics, Sports, Tech), Tags, Author, date range (last week, last month, custom range), and toggle between web articles and downloadable PDFs.
Use Case 4: University / Knowledge Base
A large documentation or academic site with courses, papers, and resources.
Custom field mappings:
| Drupal Field | Meta Tag Name | Solr Type | Facet Label |
|---|---|---|---|
| `field_department` (taxonomy) | `department` | `_s` (string) | Department |
| `field_course_level` (list) | `course_level` | `_s` (string) | Level |
| `field_year` (integer) | `year` | `_i` (integer) | Year |
| `field_format` (list) | `format` | `_s` (string) | Format |
Result: Students search for "machine learning" and filter by Department (Computer Science, Mathematics), Level (Undergraduate, Graduate), Year (slider: 2020-2024), and Format (Lecture Notes, Papers, Videos). Attached PDFs and DOCX files are fully searchable.
Search Tuning
The module gives you full control over search relevance from the Search Tuning tab — no Solr schema knowledge required:
Hybrid AI Search Modes
| Mode | How It Works | Best For |
|---|---|---|
| Union | Match by keywords OR meaning | Broadest results, discovery |
| Keywords Required | Keywords must match, meaning boosts ranking | Traditional search with AI boost |
| Meaning Required | Semantic match required, keywords boost | AI-first, concept-based search |
| Intersection | Both keywords AND meaning must match | Highest precision |
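At the document-eligibility level, the four modes reduce to set operations over the keyword and semantic hit lists. A minimal sketch (real ranking also blends keyword and vector scores; the names here are illustrative):

```python
def eligible_docs(keyword_hits, semantic_hits, mode):
    """Which documents can appear at all under each hybrid mode.
    (Set-level sketch only; actual scoring blends both signals.)"""
    k, s = set(keyword_hits), set(semantic_hits)
    return {
        "union": k | s,
        "keywords_required": k,   # semantic matches only boost ranking
        "meaning_required": s,    # keyword matches only boost ranking
        "intersection": k & s,
    }[mode]

k_hits = ["doc1", "doc2", "doc3"]
s_hits = ["doc3", "doc4"]
print(sorted(eligible_docs(k_hits, s_hits, "union")))
# ['doc1', 'doc2', 'doc3', 'doc4']
print(sorted(eligible_docs(k_hits, s_hits, "intersection")))
# ['doc3']
```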
Field Weights
Control how much each field influences ranking:
- Title weight — how much the page title affects relevance
- Description weight — meta description influence
- URL weight — URL path matching influence
- Text weight — full body text influence
- Linked Data weight — JSON-LD structured data influence
Minimum Match
Control how many search words must match:
- Flexible: great for long, natural-language queries — tolerates missing words
- Balanced: recommended default — good precision without being too strict
- Strict: every word matters — for technical or precise searches
- Custom: write your own Solr mm syntax
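These presets map onto Solr's `mm` (minimum-should-match) parameter. The values below are examples of valid `mm` syntax chosen to match the descriptions above, not the module's actual settings:

```python
# Hypothetical presets for Solr's mm (minimum-should-match) parameter.
# The expressions are valid mm syntax but illustrative; the module may
# send different values. "custom" passes any expression through as-is.
MM_PRESETS = {
    "flexible": "2<-1 5<75%",  # <=2 words: all required; >2: one may
                               # miss; >5: 75% must match
    "balanced": "3<-1 6<90%",
    "strict": "100%",          # every word must match
}

def mm_for(preset, custom=None):
    return custom if preset == "custom" else MM_PRESETS[preset]

print(mm_for("strict"))            # 100%
print(mm_for("custom", "2<-25%"))  # 2<-25%
```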
Additional Controls
- Vector candidate pool (10-1000): how many semantic matches to consider before ranking
- Content quality boost: reward pages with rich, detailed content over thin pages
All Features Included
Everything below is built into the single module — no extra modules, no extra configuration:
| Feature | Description |
|---|---|
| AI Hints | Streaming AI-generated answers above search results, powered by RAG |
| Autocomplete | Typeahead suggestions combining search history and live document matches |
| Spellcheck | "Did you mean?" corrections for typos and misspellings |
| Search Analytics | Built-in dashboard: top queries, no-results tracking, click-through rates |
| Query Elevation | Pin or exclude specific results for any query from the admin UI |
| Persistent Filters | Admin-configured Solr filter queries applied to every search |
| Real-Time Sync | Publish/unpublish/delete a node — search index updates in seconds |
| Browse Mode | Show all results with facets when no search query is entered |
| Document Search | PDFs, DOCX, XLSX, PPTX, ODT automatically crawled and searchable |
| Multilingual | Full multilingual support with automatic locale filtering |
| Commerce Support | Drupal Commerce products with price, brand, category extraction |
| Embeddable Widget | Pre-built search widget embeddable on any page or external site |
| Light & Dark Theme | Configurable theme for the search results page |
| Infinite Scroll | Choose between paginated results or infinite scroll |
Feature Comparison
| Feature | Search API Solr | Core Search | OpenSolr Search |
|---|---|---|---|
| Setup time | Hours/days | Minutes | Under 30 minutes |
| Field mapping | Manual per field | N/A | Automatic + optional custom |
| Schema management | Manual XML editing | N/A | Zero configuration |
| Indexing load on Drupal | Heavy | Heavy | Zero (crawler) or minimal (Ingestion API) |
| Solr 10.x compatible | No (1,000 field limit) | N/A | Yes |
| Faceted search | Separate Facets module | No | Built-in, auto-discovered |
| AI / Semantic search | No | No | Hybrid AI + vector |
| AI-generated answers | No | No | Built-in (RAG) |
| Autocomplete | Separate module | No | Built-in |
| Analytics | No | No | Built-in dashboard |
| Query elevation | Manual XML files | No | Admin UI |
| Document search (PDF) | Tika (complex setup) | No | Automatic |
| Boolean clause errors | Frequent | N/A | Never |
| Real-time sync | Complex hooks | Cron-based | One checkbox (Data Ingestion API) |
| Multilingual | Complex setup | Limited | Automatic |
| Commerce support | Manual mapping | No | Automatic (JSON-LD) |
Getting Started
Ready to get started?
Install the module, connect your account, start a crawl. Search is live in under 30 minutes.
Get the Module on Drupal.org · Learn More on OpenSolr

The module is free and open source (GPL-2.0). Your OpenSolr hosting plan includes the crawler, search API, faceting, autocomplete, and spellcheck. AI features (AI Hints, semantic search) are available on higher-tier plans. See pricing.
Need help? Contact us at support@opensolr.com — we will walk you through the entire process.