Complete Guide to the OpenSolr Search Drupal Module

The OpenSolr Search Drupal Module

AI-powered search for Drupal 10 & 11 — zero configuration, automatic facets, hybrid vector search. Replace any Drupal search solution in under 30 minutes.


Why Switch to OpenSolr Search?

Whether you are using Drupal's built-in core search, Search API Solr, Elasticsearch, or any other search solution, there are compelling reasons to move to the OpenSolr Search Drupal Module:

The Solr 10.x Problem

Solr 10 introduces a hard limit of 1,000 schema fields. This is a breaking change for any Drupal site that relies on Search API Solr, because the module creates separate Solr fields for every Drupal field, in every language, in every index. A multilingual site with 3 languages, 50 content fields, and 2 indexes can easily exceed 1,000 schema fields — making Search API Solr fundamentally incompatible with Solr 10.x on multilingual sites.
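The arithmetic behind that example can be sketched as follows. The base product is 3 × 50 × 2 = 300, so the site only crosses the limit once each Drupal field yields several Solr fields. The `variants_per_field` multiplier below is an illustrative assumption, not a figure taken from Search API Solr.

```python
SOLR_10_FIELD_LIMIT = 1_000  # limit quoted in this guide


def estimated_schema_fields(languages: int, content_fields: int, indexes: int,
                            variants_per_field: int = 1) -> int:
    """Rough field count when every Drupal field is mapped per language and per index."""
    return languages * content_fields * indexes * variants_per_field


# One Solr field per Drupal field: well under the limit.
base = estimated_schema_fields(languages=3, content_fields=50, indexes=2)

# Hypothetical: four Solr variants per Drupal field pushes the site over the limit.
with_variants = estimated_schema_fields(3, 50, 2, variants_per_field=4)

print(base, with_variants, with_variants > SOLR_10_FIELD_LIMIT)
```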

The OpenSolr Search module is not affected by this limit. It uses the web crawler approach — content is indexed as flat text and metadata extracted from HTML, not mapped field-by-field to the Solr schema.

The Boolean Clause Problem

Search API Solr generates excessively complex queries that routinely hit the maxBooleanClauses safety limit (default 1,024). A simple 5-word search can expand into thousands of boolean clauses because the module queries every mapped field with boosting, synonyms, and phrase variations. Solr 10.x is removing the ability to override this limit entirely.
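A back-of-the-envelope estimate shows how quickly a short query can pass the default. The field count and the number of variations per term are hypothetical inputs; the real expansion depends on the index configuration and the query parser.

```python
MAX_BOOLEAN_CLAUSES = 1_024  # Solr default quoted above


def estimated_clauses(words: int, queried_fields: int, variants_per_term: int) -> int:
    """Each search word is assumed to be tried against each field in each variation."""
    return words * queried_fields * variants_per_term


# Hypothetical: 5 words, 100 mapped fields, 3 variations (plain, synonym, phrase).
clauses = estimated_clauses(words=5, queried_fields=100, variants_per_term=3)
print(clauses, clauses > MAX_BOOLEAN_CLAUSES)
```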

Complexity vs. Simplicity

Search API Solr requires you to manually map every Drupal field to Solr, maintain complex schema configurations, install separate modules for facets, autocomplete, and highlighting, and troubleshoot silent failures when configs drift. The OpenSolr Search module replaces all of this with a single module that works out of the box.

Since this is a Drupal module, this guide focuses primarily on users migrating from Search API Solr — the most common Drupal search backend — but the installation and configuration steps apply regardless of what you are replacing.

Why OpenSolr Search

Traditional Drupal Search (Search API Solr, Core Search, etc.):

  • Manual schema + field mapping per language
  • Heavy indexing load on Drupal server
  • 1,024 boolean clause limit errors
  • Solr 10.x: 1,000 field limit breaks multilingual
  • Separate modules for facets, autocomplete, etc.
  • No AI / semantic search capability

OpenSolr Search Module:

  • Single module, zero configuration
  • Zero schema management: the crawler handles it
  • Crawler or Data Ingestion API: your choice
  • Clean queries: no clause limit issues ever
  • Solr 10.x ready: not affected by field limit
  • Facets, autocomplete, analytics all built in
  • Hybrid AI + vector + semantic search

Future-proof your Drupal search before Solr 10.x makes legacy approaches obsolete.


How It Works

The OpenSolr Search module offers two equally powerful indexing methods. You can use one or both simultaneously:

Method 1: Web Crawler

The OpenSolr Web Crawler visits your site over HTTPS, reads every published page, and indexes all content automatically — exactly like Google does. The module injects structured <meta> tags and JSON-LD data on your pages so the crawler can extract rich, facetable fields.

What the crawler discovers automatically on every page:

  • All Open Graph meta tags (og:title, og:description, og:image, og:type, og:locale)
  • All Twitter Card meta tags
  • All JSON-LD structured data — every schema.org type: Article, Product, Recipe, Event, Course, JobPosting, FAQPage, HowTo, BreadcrumbList, and hundreds more
  • All Microdata attributes (itemprop, itemtype, itemscope)
  • Product data: prices, currencies, brands, categories, SKUs (from JSON-LD, Open Graph, and microdata)
  • Full page text with structure preserved (headings, paragraphs, lists, tables)
  • Attached documents: PDFs, DOCX, XLSX, PPTX, ODT — full text extraction
  • Language and locale detection
  • Sentiment analysis (positive/negative/neutral scores)
  • Vector embeddings for semantic/AI search (BGE-m3, 1024 dimensions)
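To make the extraction step concrete, here is a minimal sketch of how Open Graph tags and JSON-LD blocks can be pulled out of a page with Python's standard library. It is not the crawler's actual implementation; the class name, function name, and sample page are invented for illustration.

```python
import json
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collects Open Graph meta tags and JSON-LD blocks from one HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.open_graph: dict[str, str] = {}
        self.json_ld: list = []
        self._in_json_ld = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            prop = attributes.get("property") or ""
            content = attributes.get("content")
            if prop.startswith("og:") and content is not None:
                self.open_graph[prop] = content
        elif tag == "script" and attributes.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed JSON-LD instead of failing the whole page
            self.json_ld.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_metadata(html: str) -> MetadataExtractor:
    extractor = MetadataExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor


SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Blue Widget">
<meta property="og:type" content="product">
<script type="application/ld+json">
{"@type": "Product", "name": "Blue Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "EUR"}}
</script>
</head><body><h1>Blue Widget</h1></body></html>
"""

result = extract_metadata(SAMPLE_PAGE)
print(result.open_graph)
print(result.json_ld[0]["offers"]["price"])
```

The same pattern extends to Twitter Card tags or microdata attributes by adding further branches in `handle_starttag`.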

How the crawler indexes your site:

  1. Your Site: Drupal 10/11 with HTTPS
  2. Web Crawler: reads HTML meta tags, reads JSON-LD data, extracts full text, extracts documents
  3. AI Processing: vector embeddings, sentiment analysis, language detection, content quality score
  4. Solr Index: searchable + facetable

Auto-discovered facetable fields: og:type, og:locale, category, author, price + currency, content_type, plus every custom meta tag you define via the module's Field Mapping tab.

Method 2: Data Ingestion API

The Data Ingestion API pushes content directly from Drupal to Solr — no crawler involved. This is fully automated inside the module:

  • Real-time sync (one checkbox): When you create, update, unpublish, or delete a node in Drupal, the search index updates within seconds. No cron required — it happens on entity save.
  • Bulk ingestion: One-click "Ingest All Now" button queues your entire site for processing. Drupal cron sends documents in configurable batches (500-5,000 per cron run, 50 per API call).
  • Field mappings create Solr fields automatically: When you map a Drupal field in the Field Mapping tab (e.g., field_color → color_sm), the Data Ingestion API sends that field value directly to Solr as a typed, facetable field. No schema editing. The field exists in Solr the moment the first document is ingested.
  • Handles 500K+ documents with constant memory per batch — no memory spikes, no timeouts.
  • Works for private sites: Intranets, password-protected sites, and sites behind firewalls that a crawler cannot reach.
  • Commerce product support: Prices, brands, SKUs, and taxonomy category hierarchies are extracted and pushed automatically.
  • Multilingual: Each translation is ingested separately with the correct locale, language tags, and translated field values (including taxonomy term labels).
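The batching described above (a configurable number of documents per cron run, 50 per API call) can be sketched like this. No real API call is made; the function names and the in-memory queue are hypothetical, and only the batch sizes come from this guide.

```python
from typing import Iterator


def chunked(items: list, size: int) -> Iterator[list]:
    """Yields consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_cron_run(queue: list, per_cron_run: int = 500,
                  per_api_call: int = 50) -> tuple[list[list], list]:
    """Returns (API-call batches for this cron run, items still queued)."""
    if not 500 <= per_cron_run <= 5000:
        raise ValueError("per_cron_run must be between 500 and 5,000")
    this_run, remaining = queue[:per_cron_run], queue[per_cron_run:]
    return list(chunked(this_run, per_api_call)), remaining


# Hypothetical queue of 1,234 node IDs waiting for ingestion.
queue = [f"node/{nid}" for nid in range(1, 1235)]
batches, remaining = plan_cron_run(queue, per_cron_run=500)
print(len(batches), len(batches[0]), len(remaining))
```

Because each call carries a fixed number of documents, the work done per request stays the same size no matter how long the queue is.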

Both methods use the same Field Mapping configuration — map a field once, and it works whether content is crawled or ingested. The crawler reads the mapped values from <meta> tags injected on the page; the Data Ingestion API reads them directly from Drupal entity fields and pushes them to Solr.


Installation & Setup

Up and running in 6 steps, under 30 minutes:

  1. Install: composer require drupal/opensolr_search
  2. Connect: enter API key, select region
  3. Select Content: pick content types + commerce products
  4. Crawl: start crawl schedule, automatic indexing
  5. Configure: map fields + facets, tune search
  6. Go Live: redirect search URL, disable old search

Step 1: Install the Module

composer require drupal/opensolr_search
drush en opensolr_search

The module has zero conflicts with Search API, core search, or any other search module. You can run them side by side during transition.

Step 2: Connect to OpenSolr

Go to Administration > Configuration > Search and metadata > Opensolr Search (/admin/config/search/opensolr).

Enter your OpenSolr account email and API key (from your OpenSolr dashboard), select the server region closest to your users, and click Save & Connect. The module automatically creates and configures a vector-enabled Solr index for you, so no schema setup is needed.

Step 3: Select Content Types

In the Data Crawler tab, select which Drupal content types and Commerce product types to include in search. Check Include attached files if you want PDFs, DOCX, and XLSX documents to be searchable too.

Step 4: Start Crawling

Click Start Crawl Schedule. The crawler begins indexing your site automatically. You can monitor progress (pages indexed, pages in queue, errors) directly from the Drupal admin.

Crawler settings you can tune:

  • Parallel threads (1-20): how many pages to fetch simultaneously
  • Request delay (0.1-10s): pause between requests to respect server limits
  • Crawl mode: shallow (sitemap only, recommended) or deep (follow all links)

Step 5: Configure Fields, Facets, and Search Tuning

This is where the power is. See the detailed sections below for Field Mapping, Faceted Search, and Search Tuning.

Step 6: Go Live

Your search page is already live at /opensolr-search. The module automatically redirects Drupal's built-in search forms to this page. When you are satisfied:

  1. Redirect any old search URLs to /opensolr-search (via Drupal Redirect module or .htaccess)
  2. Disable and uninstall your previous search modules
  3. Done. No data migration needed — the crawler or Data Ingestion API has already indexed everything independently

Faceted Search — The Complete Guide

This is the most powerful feature of the module. You can create faceted navigation from two sources: fields the crawler discovers automatically, and custom Drupal fields you map manually.

Two sources of facets

Auto-Discovered by Crawler (no configuration needed, already in your index):

  • Category (from content type / og:type)
  • Author (from meta author tag)
  • Language / Locale (from og:locale)
  • Price + Currency (from JSON-LD / microdata)
  • Content Type (web page vs document)

Custom Drupal Field Mapping (map any Drupal field to a facetable Solr field):

  • Taxonomy terms → String facets
  • Date fields → Date range pickers
  • Number fields → Min/Max sliders
  • Entity references → String facets
  • Any field → Any Solr type

Source 1: Auto-Discovered Fields

The crawler automatically extracts and indexes metadata from every page it visits. These fields are immediately available as facets without any configuration:

Field | Source | Facet Example
category_s | Content type / og:type | Article, Page, Blog Post, Product
author_s | <meta name="author"> | John Smith, Jane Doe
language_s | Language detection | English, Spanish, French
og_locale_s | <meta property="og:locale"> | en_us, es_es, fr_fr
price_f | JSON-LD Product.offers.price | Price slider ($0 to $999)
currency_s | JSON-LD priceCurrency | EUR, USD, GBP
content_type_s | HTTP Content-Type | Web pages vs Documents
product_category_s | JSON-LD Product.category | Electronics > Phones

Plus every JSON-LD property. If your pages have Recipe schema, the crawler indexes recipeCategory, cookTime, recipeYield. If you have Event schema, it indexes startDate, location, organizer. The crawler handles every schema.org type: Article, Product, Recipe, Event, Course, JobPosting, FAQPage, HowTo, and hundreds more.

Source 2: Custom Drupal Field Mapping

For fields that are not in your HTML metadata (internal Drupal fields like taxonomy terms, custom fields, entity references), the module lets you map any Drupal field to a Solr-facetable field.

Go to the Field Mapping tab in the module admin. For each field mapping you configure:

  1. Drupal Field: Select from all available node and Commerce product fields
  2. Meta Tag Name: The name used in the index (e.g., genre, difficulty, color)
  3. Solr Type: Choose the right type for your data:
     Solr Type | Use For | Facet Widget
     _s (String) | Taxonomy terms, categories, tags | Checkbox list
     _sm (String multi-valued) | Fields with multiple values | Checkbox list
     _f (Float) | Prices, ratings, scores | Min/Max slider
     _fm (Float multi-valued) | Multiple numeric values | Min/Max slider
     _i (Integer) | Counts, quantities, years | Min/Max slider
     _im (Integer multi-valued) | Multiple integer values | Min/Max slider
     _dt (Date) | Publication date, event date | Date range picker
     _dtm (Date multi-valued) | Multiple dates | Date range picker
  4. Display Label: Human-readable label shown in the search sidebar (e.g., "Genre", "Difficulty Level")

The module uses these mappings in both indexing methods: it injects <meta property="opensolr:field_name_type"> tags on your pages for the crawler to read, AND it pushes the mapped field values directly via the Data Ingestion API. Either way, the fields appear in Solr as typed, facetable fields — no Solr schema configuration needed.

Important for multi-valued fields: If your Drupal field has multiple values (e.g., a Tags taxonomy reference with multiple terms), you must use a multi-valued Solr type (_sm, _fm, _im, _dtm). The module validates this automatically.
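As a rough illustration of what a mapping turns into, the sketch below builds the meta tags for one mapped field and applies the multi-valued check described above. It is not the module's code: the `FieldMapping` class is hypothetical, and the use of a `content` attribute with one tag per value is an assumption about the output format.

```python
from dataclasses import dataclass
from html import escape

SINGLE_VALUED_SUFFIXES = {"_s", "_f", "_i", "_dt"}
MULTI_VALUED_SUFFIXES = {"_sm", "_fm", "_im", "_dtm"}


@dataclass(frozen=True)
class FieldMapping:
    drupal_field: str   # e.g. "field_color"
    meta_tag_name: str  # e.g. "color"
    solr_suffix: str    # e.g. "_sm"
    label: str          # e.g. "Color"

    @property
    def solr_field(self) -> str:
        return f"{self.meta_tag_name}{self.solr_suffix}"


def render_meta_tags(mapping: FieldMapping, values: list[str]) -> list[str]:
    """Renders one <meta> tag per value (assumed format, for illustration only)."""
    if mapping.solr_suffix not in SINGLE_VALUED_SUFFIXES | MULTI_VALUED_SUFFIXES:
        raise ValueError(f"unknown Solr type suffix: {mapping.solr_suffix}")
    if len(values) > 1 and mapping.solr_suffix not in MULTI_VALUED_SUFFIXES:
        raise ValueError(
            f"{mapping.drupal_field} has {len(values)} values; use a multi-valued type"
        )
    return [
        f'<meta property="opensolr:{mapping.solr_field}" content="{escape(value)}">'
        for value in values
    ]


color = FieldMapping("field_color", "color", "_sm", "Color")
tags = render_meta_tags(color, ["Red", "Blue"])
print("\n".join(tags))
```

Passing two values to a mapping declared as `_s` raises a `ValueError`, which mirrors the validation rule for multi-valued fields.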

Configuring Facets

After the crawler has indexed your site, go to the Field Mapping tab. Below the field mapping section, you will see a Facet Configuration table showing all facetable fields discovered in your index, with document counts.

For each field you can:

  • Enable/Disable it as a visible facet in the search sidebar
  • Set the widget type:
    • List (checkboxes) — for string fields like categories, authors, tags
    • Slider (min/max range) — for numeric fields like price, rating
    • Date Range (date picker) — for date fields like publication date
  • Set the display label shown to users
  • Set the display order (weight) — drag to reorder
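The relationship between Solr type suffix, widget, and display order can be summarized in a few lines. This is a sketch of the rules in this guide, not the module's internals; the widget identifiers and the shape of the config dictionaries are invented for the example.

```python
WIDGET_BY_SUFFIX = {
    "_s": "list", "_sm": "list",
    "_f": "slider", "_fm": "slider",
    "_i": "slider", "_im": "slider",
    "_dt": "date_range", "_dtm": "date_range",
}


def default_widget(solr_field: str) -> str:
    """Picks a widget from the type suffix after the last underscore."""
    suffix = "_" + solr_field.rsplit("_", 1)[-1]
    try:
        return WIDGET_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"no known type suffix on {solr_field!r}") from None


def sidebar_facets(facet_config: list[dict]) -> list[tuple[str, str]]:
    """Returns (label, widget) pairs for enabled facets, ordered by weight."""
    enabled = sorted((f for f in facet_config if f["enabled"]), key=lambda f: f["weight"])
    return [(f["label"], default_widget(f["field"])) for f in enabled]


# Hypothetical facet configuration.
config = [
    {"field": "price_f", "label": "Price", "enabled": True, "weight": 2},
    {"field": "category_s", "label": "Category", "enabled": True, "weight": 1},
    {"field": "og_locale_s", "label": "Locale", "enabled": False, "weight": 0},
    {"field": "pub_date_dt", "label": "Published", "enabled": True, "weight": 3},
]
sidebar = sidebar_facets(config)
print(sidebar)
```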

Facet widget types:

  • List (Checkboxes): for categories, tags, authors. Example: Electronics (142), Clothing (89), Home & Garden (67), Sports (45). Users click to filter. Active filters are shown as removable pills above results.
  • Slider (Min/Max): for prices, ratings, scores. Example: Price Range, $25 to $150. Users drag handles to set the range. Works with any numeric Solr field.
  • Date Range Picker: for dates, timestamps. Example: Published Between 2024-01-01 and 2024-12-31. Calendar date pickers for start/end. Works with any date Solr field.


Concrete Use Cases

Use Case 1: Language Teaching Platform

A Spanish language teaching site with articles, lessons, and resources.

Field Mappings:

Drupal Field | Meta Tag Name | Solr Type | Facet Label
field_level (taxonomy) | level | _s (string) | Level
field_topic (taxonomy) | topic | _sm (string multi) | Topic
field_resource_type (taxonomy) | resource_type | _s (string) | Resource Type
field_target_audience (list) | audience | _s (string) | Audience

Result: Users search for "conjugation exercises" and filter by Level (A1, A2, B1, B2), Topic (Grammar, Vocabulary), and Resource Type (Worksheet, Video, Article). All facets update counts in real time as filters are applied.

Use Case 2: E-Commerce Store

A Drupal Commerce store selling products with complex attributes.

Auto-discovered facets (no configuration needed):

  • Price slider from JSON-LD Product schema
  • Currency filter
  • Product categories from JSON-LD breadcrumbs

Custom field mappings:

Drupal Field | Meta Tag Name | Solr Type | Facet Label
field_brand (entity ref) | brand | _s (string) | Brand
field_color (taxonomy) | color | _sm (string multi) | Color
field_size (list) | size | _sm (string multi) | Size
field_rating (decimal) | rating | _f (float) | Rating
field_in_stock (boolean) | stock | _s (string) | Availability

Result: Users browse products with facets for Brand, Color, Size, Price (slider), and Rating (slider). Commerce product prices, SKUs, and categories are extracted automatically from Drupal Commerce entities and JSON-LD.

Use Case 3: News / Media Site

A content-heavy site with thousands of articles across multiple categories.

Custom field mappings:

Drupal Field | Meta Tag Name | Solr Type | Facet Label
field_section (taxonomy) | section | _s (string) | Section
field_tags (taxonomy) | tags | _sm (string multi) | Tags
field_published_date (date) | pub_date | _dt (date) | Published

Auto-discovered facets:

  • Author (from article:author meta tag)
  • Language (from og:locale)
  • Content type (web pages vs attached PDFs)

Result: Users search articles and filter by Section (Politics, Sports, Tech), Tags, Author, date range (last week, last month, custom range), and toggle between web articles and downloadable PDFs.

Use Case 4: University / Knowledge Base

A large documentation or academic site with courses, papers, and resources.

Custom field mappings:

Drupal Field | Meta Tag Name | Solr Type | Facet Label
field_department (taxonomy) | department | _s (string) | Department
field_course_level (list) | course_level | _s (string) | Level
field_year (integer) | year | _i (integer) | Year
field_format (list) | format | _s (string) | Format

Result: Students search for "machine learning" and filter by Department (Computer Science, Mathematics), Level (Undergraduate, Graduate), Year (slider: 2020-2024), and Format (Lecture Notes, Papers, Videos). Attached PDFs and DOCX files are fully searchable.


Search Tuning

The module gives you full control over search relevance from the Search Tuning tab — no Solr schema knowledge required:

Hybrid AI Search Modes

Mode | How It Works | Best For
Union | Match by keywords OR meaning | Broadest results, discovery
Keywords Required | Keywords must match, meaning boosts ranking | Traditional search with AI boost
Meaning Required | Semantic match required, keywords boost | AI-first, concept-based search
Intersection | Both keywords AND meaning must match | Highest precision
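One way to picture the four modes is as set operations over two candidate lists: documents matched by keywords and documents matched by meaning. The sketch below covers only which documents qualify, not how they are ranked; the mode identifiers and document IDs are hypothetical.

```python
def hybrid_candidates(keyword_hits: set[str], semantic_hits: set[str], mode: str) -> set[str]:
    """Returns the documents eligible for ranking under each hybrid mode."""
    if mode == "union":
        return keyword_hits | semantic_hits
    if mode == "keywords_required":
        return set(keyword_hits)   # semantic score would only affect ranking
    if mode == "meaning_required":
        return set(semantic_hits)  # keyword score would only affect ranking
    if mode == "intersection":
        return keyword_hits & semantic_hits
    raise ValueError(f"unknown mode: {mode}")


keyword_hits = {"doc1", "doc2", "doc3"}
semantic_hits = {"doc2", "doc3", "doc4"}

for mode in ("union", "keywords_required", "meaning_required", "intersection"):
    print(mode, sorted(hybrid_candidates(keyword_hits, semantic_hits, mode)))
```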

Field Weights

Control how much each field influences ranking:

  • Title weight — how much the page title affects relevance
  • Description weight — meta description influence
  • URL weight — URL path matching influence
  • Text weight — full body text influence
  • Linked Data weight — JSON-LD structured data influence
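Per-field weights like these are conventionally expressed in Solr as a query-fields string of `field^boost` pairs for the eDisMax parser. Whether the module sends exactly this parameter is an assumption, and the field names and weights below are placeholders rather than the module's real schema.

```python
def build_qf(weights: dict[str, float]) -> str:
    """Builds a 'field^boost' string; fields with a weight of 0 are left out."""
    return " ".join(f"{field}^{weight:g}" for field, weight in weights.items() if weight > 0)


# Hypothetical field names and weights.
weights = {"title": 5, "description": 3, "uri": 0.5, "text": 1, "linked_data": 0}
qf = build_qf(weights)
print(qf)
```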

Minimum Match

Control how many search words must match:

  • Flexible: great for long, natural-language queries — tolerates missing words
  • Balanced: recommended default — good precision without being too strict
  • Strict: every word matters — for technical or precise searches
  • Custom: write your own Solr mm syntax
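For the Custom option, it helps to see how a simple mm value translates into a required number of matching words. The sketch below evaluates only the basic forms (a count, a negative count, a percentage, a negative percentage) and rejects conditional expressions such as `2<-25%`. The guide does not state which mm values the Flexible, Balanced, and Strict presets use, so the values shown are examples only.

```python
def required_matches(mm: str, optional_clauses: int) -> int:
    """Evaluates a simple mm spec for a query with `optional_clauses` words."""
    spec = mm.strip()
    if "<" in spec or " " in spec:
        raise ValueError("conditional mm expressions are not handled in this sketch")
    if spec.endswith("%"):
        percent = int(spec[:-1])
        amount = abs(percent) * optional_clauses // 100  # percentages round down
        needed = amount if percent >= 0 else optional_clauses - amount
    else:
        count = int(spec)
        needed = count if count >= 0 else optional_clauses + count
    # The result is kept within 0..optional_clauses.
    return max(0, min(optional_clauses, needed))


for spec in ("100%", "75%", "-25%", "2", "-1"):
    print(spec, required_matches(spec, optional_clauses=5))
```

Negative values describe how many words may be missing, which is why `-25%` on a five-word query still requires four matches.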

Additional Controls

  • Vector candidate pool (10-1000): how many semantic matches to consider before ranking
  • Content quality boost: reward pages with rich, detailed content over thin pages

All Features Included

Everything below is built into the single module — no extra modules, no extra configuration:

Feature | Description
AI Hints | Streaming AI-generated answers above search results, powered by RAG
Autocomplete | Typeahead suggestions combining search history and live document matches
Spellcheck | "Did you mean?" corrections for typos and misspellings
Search Analytics | Built-in dashboard: top queries, no-results tracking, click-through rates
Query Elevation | Pin or exclude specific results for any query from the admin UI
Persistent Filters | Admin-configured Solr filter queries applied to every search
Real-Time Sync | Publish/unpublish/delete a node and the search index updates in seconds
Browse Mode | Show all results with facets when no search query is entered
Document Search | PDFs, DOCX, XLSX, PPTX, ODT automatically crawled and searchable
Multilingual | Full multilingual support with automatic locale filtering
Commerce Support | Drupal Commerce products with price, brand, category extraction
Embeddable Widget | Pre-built search widget embeddable on any page or external site
Light & Dark Theme | Configurable theme for the search results page
Infinite Scroll | Choose between paginated results or infinite scroll

Feature Comparison

Feature | Search API Solr | Core Search | OpenSolr Search
Setup time | Hours/days | Minutes | Under 30 minutes
Field mapping | Manual per field | N/A | Automatic + optional custom
Schema management | Manual XML editing | N/A | Zero configuration
Indexing load on Drupal | Heavy | Heavy | Zero (crawler) or minimal (Ingestion API)
Solr 10.x compatible | No (1,000 field limit) | N/A | Yes
Faceted search | Separate Facets module | No | Built-in, auto-discovered
AI / Semantic search | No | No | Hybrid AI + vector
AI-generated answers | No | No | Built-in (RAG)
Autocomplete | Separate module | No | Built-in
Analytics | No | No | Built-in dashboard
Query elevation | Manual XML files | No | Admin UI
Document search (PDF) | Tika (complex setup) | No | Automatic
Boolean clause errors | Frequent | N/A | Never
Real-time sync | Complex hooks | Cron-based | One checkbox (Data Ingestion API)
Multilingual | Complex setup | Limited | Automatic
Commerce support | Manual mapping | No | Automatic (JSON-LD)

Getting Started

Ready to get started?

Install the module, connect your account, start a crawl. Search is live in under 30 minutes.

The module is free and open source (GPL-2.0). Your OpenSolr hosting plan includes the crawler, search API, faceting, autocomplete, and spellcheck. AI features (AI Hints, semantic search) are available on higher-tier plans. See pricing.

Need help? Contact us at support@opensolr.com — we will walk you through the entire process.