Opensolr Changelog

Recent updates and improvements to the Opensolr platform.


Search Apr 10, 2026

  • Improved Hybrid search relevance: in vector hybrid mode, lexical field weight boosts are now automatically zeroed out so that semantic similarity drives the relevance ranking. Previously, even small lexical boosts could inflate scores well above the 0-1 vector range, pushing semantically relevant results down. Field weights still apply when search falls back to pure lexical mode.
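The boost handling described above can be pictured with a small sketch. The mode names ("hybrid", "lexical") and the function name are hypothetical, chosen only for illustration; the actual implementation is not shown in this changelog.

```python
def effective_field_weights(field_weights, mode):
    """Return the lexical boosts to apply for a given search mode.

    Hypothetical helper: in hybrid mode every boost is zeroed so the
    vector similarity decides the ranking; otherwise boosts are kept.
    """
    if mode == "hybrid":
        return {field: 0.0 for field in field_weights}
    return dict(field_weights)


weights = {"title": 5.0, "description": 2.0}
hybrid_weights = effective_field_weights(weights, "hybrid")
lexical_weights = effective_field_weights(weights, "lexical")
```

The configured weights are left untouched, so a fallback to pure lexical mode can reuse them as they are.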

Data Ingestion Apr 9, 2026

  • Improved Data Ingestion queue dashboard redesigned for scale — indexes and jobs now load progressively on click instead of all at once. Per-index progress dialog shows real-time status for each index separately. Handles thousands of jobs without performance degradation.

Search Apr 9, 2026

  • Improved Search results count now appears directly below the search bar for immediate feedback. Response time is now displayed in seconds (e.g. 0.09 s) instead of milliseconds. Price and date facet sliders stay visible even when filters narrow results to a single item.

Data Ingestion Apr 9, 2026

  • Improved Data Ingestion enrichment pipeline: embeddings now use structured text (text_t) when available for richer semantic vectors. Sentiment analysis now uses the first 10 complete sentences instead of just title + description. The document size field is now computed automatically.

Search Apr 4, 2026

  • Improved Price filter topbar slider tooltips now show thousands separators (e.g. 15,638.00 instead of 15638.00). Consistent number formatting across all slider controls, tooltips, and input fields.

Drupal Apr 4, 2026

  • Improved Per-facet minimum count threshold — configure how many documents a facet value needs before it appears. Instant CSS tooltips on facet hover replace slow browser title popups. Facet values preserve original casing from the source data.

Web Crawler Apr 4, 2026

  • Improved Web crawler date extraction for sites that lack JSON-LD or meta tags. New targeted extraction looks for dates inside HTML elements with date-related CSS classes (e.g. .date, .posted, .info), which is much safer than scanning all page text.

Search Apr 4, 2026

  • Improved Active filter pills now show numbers with thousands separators (e.g. "Price from: 9,507.99" instead of "9507.99"). Applies to both the facet sidebar pills and the topbar price filter pills.
  • Improved Facet slider inputs now display numbers with thousands separators (e.g. 15,637.00 instead of 15637.00) for better readability. Float fields (_f) show 2 decimals, integer fields show none. Min values are floored and max values are ceiled so the range always covers all results.
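The formatting rules above (thousands separators, 2 decimals for _f fields, none for integer fields, floored minimum and ceiled maximum) can be sketched as follows. The function names and the example integer field name are assumptions made for illustration, not part of the Opensolr UI code.

```python
import math


def format_slider_value(value, field_name):
    """Format a number for display with thousands separators."""
    if field_name.endswith("_f"):
        # Float fields show exactly two decimals.
        return f"{value:,.2f}"
    # Any other field is treated as an integer field here (assumption).
    return f"{int(value):,}"


def slider_bounds(values):
    """Floor the minimum and ceil the maximum so the range covers all values."""
    return math.floor(min(values)), math.ceil(max(values))


label = format_slider_value(15637, "price_f")
bounds = slider_bounds([9507.99, 15636.2])
```

Rounding the bounds outward means a result priced at 9,507.99 still falls inside a slider that starts at 9,507.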

Website Apr 3, 2026

  • Improved Homepage Drupal card updated — now showcases the Opensolr Search for Drupal module with full-text search, facets, analytics, and AI features. New vector search e-commerce demo added to the live demos section.

Web Crawler Apr 3, 2026

  • Improved Web Crawler now respects the package's vector access flag — indexes without AI features skip embedding generation during crawls, saving GPU resources and speeding up indexing.

Performance & SEO Apr 3, 2026

  • Improved GPU priority queue for AI embeddings — user-facing search requests now always jump ahead of background crawler batch jobs. Prevents slow search during heavy indexing. Includes a configurable timeout so search falls back to lexical instantly if the GPU is busy.
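A two-level priority queue of the kind described above can be sketched with Python's heapq. This is an illustration only: the class, the priority constants, and the job labels are hypothetical, and the timeout-based fallback to lexical search is not modeled here.

```python
import heapq
import itertools

SEARCH, CRAWLER = 0, 1  # lower number is served first


class EmbeddingQueue:
    """Serve user-facing search jobs before background crawler jobs."""

    def __init__(self):
        self._heap = []
        # A running counter keeps FIFO order within the same priority.
        self._counter = itertools.count()

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def next_job(self):
        """Return the next job to run, or None when the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None


queue = EmbeddingQueue()
queue.submit(CRAWLER, "crawl-batch-1")
queue.submit(CRAWLER, "crawl-batch-2")
queue.submit(SEARCH, "user-query")
order = [queue.next_job() for _ in range(3)]
```

Even though the search request was submitted last, it is popped first; the crawler batches keep their original order behind it.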

Search Apr 3, 2026

  • Improved Search header now stays fixed at the top while scrolling through results, keeping the search box and controls always within reach.
  • Improved Search transitions now show a loading overlay with animated dots, so you know the search is working — especially helpful on slower connections or large indexes.

Drupal Mar 29, 2026

  • Improved Language selector dropdown now shows the active language first instead of alphabetical order. Crawler Settings section moved above Content Types for better admin UX.
  • Improved Data Crawler and Data Ingestion admin tabs now show live document counts next to each content type — checked types show the real count, unchecked show 0, with a running total that updates instantly as you toggle checkboxes.
  • Improved Opensolr Search for Drupal — full multilingual support for meta tags and JSON-LD structured data. Product categories, tags, and brands now output in the correct page language, so the Web Crawler extracts translated facet values identical to Data Ingestion.

Drupal Mar 28, 2026

  • Improved Opensolr Search for Drupal — added Stop Crawl button, price display with currency symbols, and optimized query embeddings for better search relevance.

API Mar 28, 2026

  • Improved New is_query parameter on the embed and batch embed API endpoints. Set is_query=1 when embedding search queries to get optimized retrieval vectors.

Web Crawler Mar 28, 2026

  • Improved Core-wide thread limit enforcement — the max_threads setting now controls the total number of concurrent crawler processes across all start URLs for an index, not per URL. Setting threads to 1 means exactly 1 process at a time.
  • Improved Smarter description extraction — the crawler no longer picks up CSS, JavaScript, or theme builder garbage as page descriptions. Description priority: meta tags → JSON-LD structured data → first two sentences of extracted text.
  • Improved Crawler settings changes now take effect immediately. When you save new thread count, crawl mode, renderer, or pause settings, the active crawl schedule is automatically updated — no need to stop and restart.
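The description priority listed above (meta tags, then JSON-LD, then the first two sentences of extracted text) amounts to a simple fallback chain. The sketch below assumes the three candidate strings have already been extracted; the function name and the naive sentence splitter are illustrative choices, not the crawler's actual code.

```python
import re


def pick_description(meta_description, jsonld_description, body_text):
    """Return the first usable description in priority order."""
    for candidate in (meta_description, jsonld_description):
        if candidate and candidate.strip():
            return candidate.strip()
    # Fallback: first two sentences, split naively on ., ! or ? plus whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", body_text.strip())
    return " ".join(sentences[:2])


fallback = pick_description(None, "", "First one. Second one. Third one.")
```

A production splitter would need to handle abbreviations and non-Latin punctuation; the regular expression here is only meant to show the fallback order.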

Search Mar 28, 2026

  • Improved Price badges now display currency symbols (€, $, £, ¥, etc.) instead of raw currency codes in search results. Prices also display correctly for all indexes regardless of schema configuration.
  • Improved Vector search accuracy: the embedding model now uses instruction-tuned prefixes that optimize query vectors for retrieval. Expect 5–15% better recall on natural language queries and stronger cross-language matching (e.g. searching in Romanian and finding English results).

Drupal Mar 25, 2026

Security Mar 25, 2026

  • Improved 3-round security audit on the Drupal module: XSS protection on AI streaming output (DOMParser sanitizer), Solr injection prevention on range filters and facet values, CSRF on all admin write endpoints, and safe URL generation in CLI/cron context. 35+ issues found and fixed across all module files.

Data Ingestion Mar 25, 2026

  • Improved Full CJK and UTF-8 safety across the entire enrichment pipeline. Japanese, Chinese, emoji, and accented characters no longer crash batch processing. All text is sanitized before embedding, Solr push, and JSON encoding. One malformed byte in a PDF can never kill an entire batch again.
  • Improved Ingestion API now skips individual duplicate documents instead of rejecting the entire batch. One duplicate URI no longer blocks 49 other documents from being indexed. The Ingestion Queue also auto-refreshes every 20 seconds when jobs are pending.
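The per-document duplicate handling described above can be sketched as a filter that sets duplicates aside and lets the rest of the batch through. What exactly counts as a duplicate on the server is not spelled out here, so this sketch assumes "a URI already pending or already seen earlier in the same batch"; the function and field names are hypothetical.

```python
def split_batch(documents, pending_uris):
    """Separate a batch into accepted documents and skipped duplicate URIs."""
    accepted, skipped = [], []
    seen = set(pending_uris)
    for doc in documents:
        uri = doc["uri"]
        if uri in seen:
            skipped.append(uri)  # skip only this document, not the batch
        else:
            seen.add(uri)
            accepted.append(doc)
    return accepted, skipped


batch = [{"uri": "https://example.com/a"},
         {"uri": "https://example.com/b"},
         {"uri": "https://example.com/a"}]
accepted, skipped = split_batch(batch, {"https://example.com/b"})
```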

Documentation Mar 23, 2026

Web Crawler Mar 20, 2026

  • Improved Web Crawler now automatically removes documents from the search index when their pages return non-200 status codes (404, 500, etc.) during crawling. Previously, dead pages could remain in search results indefinitely.

Search Mar 18, 2026

  • Improved Search clear button (✕) is now larger and more tappable on mobile — the button is bigger, has a generous tap target with padding, and shows a visual press animation on touch. Easier to clear a query on any device.
  • Improved Fresh mode is now a date window filter — Search Tuning now lets you set a Freshness Window of 2 to 365 days. When "Fresh" mode is selected, only content published within that window is returned — no more recent-but-irrelevant results pushing down the most relevant ones. The old boost-factor approach that could surface off-topic content is replaced by a clean date range filter.
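A date window filter like the one described above maps naturally onto a Solr filter query using date math. The sketch below is an assumption about how such a filter could be built: the field name publish_date is hypothetical, and clamping out-of-range values to the 2 to 365 day limits (rather than rejecting them) is a choice made for the example.

```python
def freshness_filter(days, date_field="publish_date"):
    """Build a Solr fq string that keeps only documents inside the window."""
    days = max(2, min(365, int(days)))  # clamp to the documented 2-365 range
    return f"{date_field}:[NOW-{days}DAYS TO NOW]"


fq = freshness_filter(30)
```

Because this is a filter rather than a boost, documents outside the window are excluded entirely and the remaining results keep their normal relevance order.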

Search Mar 17, 2026

  • Improved Search relevancy defaults updated to Flexible minimum match — queries now return more results by default, especially for longer natural-language searches. Short queries (1-2 words) still require all terms to match, while longer queries allow partial matches for better recall. Per-index Search Tuning overrides are unaffected.

Documentation Mar 15, 2026

Control Panel Mar 13, 2026

  • Improved Reload and Reset error messages in the Error Audit are now human-readable — raw Java stack traces are replaced with a short root-cause summary and a direct link to the Error Log for the full details.
  • Improved Index Reset is now bulletproof — the reset process verifies the index is actually empty after clearing it. If the standard reset fails (locked segments, corrupt index), it automatically falls back to a hard reset that nukes the data directory and rebuilds from scratch. Reset status is now properly reported back to the UI instead of always showing success.

Documentation Mar 8, 2026

  • Improved Updated the Data Ingestion API documentation with full content_type field guidance. The field reference now explains the default behavior, how it controls web vs media display in search results, and how MIME types are auto-detected when using rtf:true. All code examples (cURL, PHP, Python) now include content_type.
  • Improved Comprehensive code examples added to the Data Ingestion API documentation. Full working PHP and Python examples for both submission methods — JSON body and file upload — with error handling and job status polling. Plus updated cURL examples for every workflow.

Data Ingestion Mar 8, 2026

  • Improved Cleaner error display in the ingestion queue table. Error columns now show a short summary like "30 ok, 20 doc(s) failed — click for details" instead of the full raw error. Click the message to open the Job Detail modal with the complete breakdown per document.
  • Improved Detailed Solr error reporting in the Data Ingestion Queue. When a document fails at the Solr level — unknown field, type mismatch, schema violation — the exact error from Solr is captured and shown in the Job Detail modal. No more guessing why a document was rejected.

API Mar 8, 2026

  • Improved The Data Ingestion API now returns a doc_ids array in every successful response, showing the auto-generated document ID (md5 of uri) for each document in your batch. Use these IDs to track, query, or update specific documents in your index.
  • Improved URI is now mandatory for every document in the Data Ingestion API. The document ID is always generated as md5(uri), making the URI the single source of truth for document identity. Same URI = same document. Resubmitting a URI updates the existing document. Duplicate URIs in pending jobs are automatically rejected to prevent accidental double-indexing.
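Since the document ID is defined as md5(uri), a client can compute it locally to predict or look up IDs. The sketch below assumes the URI is hashed as UTF-8 bytes without any normalization and that the ID is the lowercase hex digest; both are assumptions, so compare against the doc_ids returned by the API before relying on it.

```python
import hashlib


def document_id(uri):
    """Compute the md5 hex digest of a URI (assumed UTF-8, no normalization)."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()


first = document_id("https://example.com/page")
second = document_id("https://example.com/page")
other = document_id("https://example.com/page/")
```

Note that under this assumption a trailing slash produces a different ID, which is why "same URI = same document" only holds for byte-identical URIs.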

Search Mar 7, 2026

  • Improved The Pin, Exclude, and Exclude All buttons on the search elevation toolbar are now high-contrast and color-coded — orange for Pin, red for Exclude — so they stand out clearly as interactive controls.
  • Improved Elevation actions are now mutually exclusive per document — clicking Pin on an excluded result automatically removes the exclude first, and vice versa. No more stale conflicting rules.
  • Improved Query Analytics & Tools — the former Query Statistics page has been completely redesigned into a clean tabbed application. Overview, Queries, and Elevation Rules each live in their own tab with lazy AJAX loading. Elevation rules now show full document details (title, description, URL) instead of raw Solr IDs, with accordion-style collapsible query groups and a regex search to instantly find any elevated document across all rules.

Control Panel Mar 6, 2026

  • Improved The Add New Index page now uses a sidebar filter panel instead of dropdown menus. Region, Version, Country, Type, and Crawler filters are always visible on the left, with result counts next to each value. Click any value to filter, click it again to clear. Active filters are highlighted and a Clear All link resets everything. On mobile, filters collapse behind a sticky Filters button at the top of the page.
  • Improved The Add New Index page is now more compact and easier to scan. Fonts, cards, and filter controls have all been tightened up so you can see more server options at a glance without scrolling. Each card shows the key details — Solr version, region, and server type — cleanly and without clutter.

Web Crawler Mar 5, 2026

  • Improved Faster Playwright rendering in Chrome mode. Pages now complete in ~0.5–1s instead of 2–25s. The old approach waited for all network activity to stop (analytics, trackers, ad pixels), which stalled on busy pages. Now it waits for the DOM, gives JS 500ms to hydrate, and grabs the content.
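The wait strategy described above (wait for the DOM, allow 500 ms for hydration, then read the content) can be sketched against Playwright's synchronous Python API. This is a minimal illustration, not the crawler's code: the function name and the hydrate_ms parameter are hypothetical, and the function takes an already opened page object.

```python
def render_page(page, url, hydrate_ms=500):
    """Load a URL and return its HTML without waiting for network idle."""
    # Return as soon as the DOM is parsed instead of waiting for all
    # network activity (analytics, trackers, ad pixels) to stop.
    page.goto(url, wait_until="domcontentloaded")
    # Give client-side JavaScript a short, fixed window to hydrate.
    page.wait_for_timeout(hydrate_ms)
    return page.content()


# Usage with Playwright installed (not executed here):
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       browser = p.chromium.launch()
#       html = render_page(browser.new_page(), "https://example.com")
#       browser.close()
```

The fixed delay trades completeness for speed: content injected later than the hydration window would be missed.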

Web Crawler Mar 4, 2026

  • Improved Solr batch indexing is now more reliable during crawls. When a batch insert to Solr fails (e.g. temporary overload or timeout), the documents are kept in the local buffer and retried on the next flush cycle, instead of being silently lost.
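The retry behavior described above comes down to clearing the buffer only after a successful send. The sketch below illustrates that idea under stated assumptions: the class name, the send_batch callable, and the batch size are hypothetical stand-ins for whatever the crawler uses to push documents to Solr.

```python
class IndexBuffer:
    """Buffer documents and keep them when a batch send fails."""

    def __init__(self, send_batch, batch_size=100):
        self._send_batch = send_batch  # callable that raises on failure
        self._batch_size = batch_size
        self._docs = []

    def add(self, doc):
        self._docs.append(doc)
        if len(self._docs) >= self._batch_size:
            self.flush()

    def flush(self):
        """Try to send everything buffered; return True on success."""
        if not self._docs:
            return True
        try:
            self._send_batch(list(self._docs))
        except Exception:
            return False  # documents stay buffered for the next flush
        self._docs.clear()
        return True

    @property
    def pending(self):
        return len(self._docs)
```

A failed flush leaves the documents in place, so the next flush cycle sends them together with any newer documents.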

Web Crawler Mar 2, 2026

  • Improved Smarter Resume for the Web Crawler. Clicking Resume now always launches the crawler, even when the queue appears empty. Previously, the UI would refuse to resume if there were no pages left in the queue — but that is exactly the scenario where Resume needs to work, because the crawler re-discovers new content by re-reading your sitemaps. No more misleading "nothing to resume" messages.

Website Feb 27, 2026

  • Improved Updated the Terms & Conditions and Privacy Policy pages with clearer language, current compliance standards, and improved formatting — making both documents easier to read and navigate.

Index Management Feb 27, 2026

  • Improved The Optimize button in Index Tools is now smarter. It detects whether an optimization is already running and shows you live index stats — segments, index size, document count, and deleted docs — every time you click it. If an optimization is in progress, it shows the current status instead of accidentally starting a second one. No more guessing or clicking Check Progress repeatedly.

Web Crawler Feb 27, 2026

  • Improved Web Crawler indexing is now faster — crawled pages are sent to Solr in larger batches instead of one at a time, reducing round-trip overhead and significantly speeding up the overall indexing process.
  • Improved Clicking Resume when the crawler queue is empty now shows a clear message explaining there are no more pages to process, instead of silently doing nothing. It suggests stopping the cron schedule and starting a fresh crawl.
  • Improved The crawler status badge now distinguishes between Running (green), Paused (blue), and Stopped (red). When the cron schedule is active but no crawler processes are running, the dashboard shows Paused instead of Running, so you always know the actual state of your crawl.
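The badge logic described above can be written as a small decision function. The Paused case follows the text directly; treating "no cron schedule and no processes" as Stopped is an assumption, and the function and dictionary names are hypothetical.

```python
BADGE_COLORS = {"Running": "green", "Paused": "blue", "Stopped": "red"}


def crawler_status(cron_active, running_processes):
    """Derive the status badge from the schedule and live process count."""
    if running_processes > 0:
        return "Running"
    # Schedule active but nothing running: Paused rather than Running.
    return "Paused" if cron_active else "Stopped"


status = crawler_status(cron_active=True, running_processes=0)
color = BADGE_COLORS[status]
```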

Search Feb 25, 2026

  • Improved Vector search verified and battle-tested — hybrid search (vector + keyword) has been tested across live indexes with real-world queries. Semantic understanding works out of the box: try searching for "how do I download my invoices and upgrade my account?" on opensolr.com and see how it finds the right pages even when no document contains those exact words.

Web Crawler Feb 25, 2026

  • Improved Smarter content extraction — the web crawler now uses a dual-extraction strategy that runs two independent text extraction engines and picks whichever captures more real content. Pages with heavy JavaScript, complex layouts, or framework-rendered content (React, Next.js, Angular, Vue) are now detected and rendered automatically. The result: richer, more complete text in your Opensolr Index, especially for modern web applications.

Search Feb 25, 2026

  • Improved Search result snippets are now shorter and more meaningful. Instead of dumping huge walls of text, the highlighter picks the most relevant sentence around your search terms — cleaner, easier to scan, and actually useful.

Infrastructure Feb 25, 2026

  • Improved Database Import now shows the actual upload size limit read from the server's PHP configuration. If your file exceeds the limit, you'll see the exact value and which PHP-FPM settings to adjust (post_max_size and upload_max_filesize in php.ini) — no more guessing why large imports fail.

Index Management Feb 25, 2026

  • Improved Default worker count increased from 3 to 10 and default batch size from 100 to 200 items per worker, delivering significantly faster indexing out of the box on modern servers.
  • Improved Background mode is now the default — drush ost automatically runs as a background daemon that survives SSH disconnection. No need to add --background anymore. Use --no-background if you want foreground/interactive mode.
  • Improved Search on the My Solr Indexes page now works reliably on mobile devices — the search bar uses native form submission so the Go, Search, and Done buttons on any mobile keyboard (Gboard, SwiftKey, iOS, Samsung) all trigger the search correctly.
  • Improved The Indexes and Clusters list now remembers your search filter in the URL — bookmark or share a filtered view of your indexes, especially useful when managing a large number of indexes.

Analytics Feb 25, 2026

  • Improved Latest Queries now uses server-side pagination — browse through all your requests within any date range without loading everything at once. Filter by query text, HTTP status, or IP address and hit Search to find exactly what you need across all shards.

Website Feb 24, 2026

  • Improved Review badges in footer replaced with crisp inline SVG cards for Trustpilot, Google, and ISO certifications
  • Improved Security compliance section redesigned with individual SVG badge cards for Software Security, EU GDPR, PCI DSS, and Content Security Policy

Index Management Feb 24, 2026

  • Improved Reload, Reset, Commit, and Restart actions now run via AJAX with an elegant modal dialog showing a loading spinner and result status, instead of navigating away from the page
  • Improved Merged Delete By Query and Utilities into a single unified Index Tools panel with a cleaner design
  • Improved Available Backup Files and Restore Actions Log now display side by side for easier overview
  • Improved Redesigned Backup Manager with styled action buttons, emoji icons, and cleaner layout

Infrastructure Feb 24, 2026

  • Improved Adding an IP access rule now shows instant inline feedback and refreshes the rules list automatically
  • Improved Saving HTTP Auth credentials now shows an inline confirmation message without navigating away from the page

Index Management Feb 24, 2026

  • Improved Upload notes are now hidden by default behind an info button in the section header, reducing visual clutter
  • Improved Redesigned Config Files Manager with a cleaner panel layout, styled file selector, and a + toggle button to reveal the new file form inline

Analytics Feb 23, 2026

  • Improved Filter queries (fq) are now extracted and displayed when the main query is a wildcard browse, so you can always see what your users were actually looking for
  • Improved Advanced Solr syntax — nested local params, boost wrappers, vector/hybrid search, function expressions, and dollar-reference variables — is now intelligently parsed and resolved to reveal the real search terms
  • Improved Queries from Drupal, WordPress, and other CMS platforms are now automatically translated into clean, readable strings
  • Improved Analytics queries now load significantly faster with parallel shard processing

Search Feb 21, 2026

Website Feb 20, 2026

  • Improved Better mobile responsiveness across all pages
  • Improved Live search demos now accessible directly from the homepage
  • Improved Step-by-step "How It Works" section to help new users get started quickly

Analytics Feb 20, 2026

  • Improved Faster analytics loading with parallel data processing

Performance & SEO Feb 19, 2026

  • Improved All images now include descriptive alternative text
  • Improved Higher color contrast ratios across the site for better readability
  • Improved Better screen reader support with semantic HTML landmarks
  • Improved Lazy loading for non-critical images
  • Improved Minified CSS and JavaScript for smaller page sizes
  • Improved Reduced layout shift during page load with explicit image dimensions
  • Improved Faster page loads with browser caching on public pages
  • Improved FAQ page headings now show the actual category name
  • Improved Sharing Opensolr links on social media now shows proper preview images, titles, and descriptions

Website Jan 21, 2026

  • Improved Cleaner layout with better readability for plan options and pricing details