Opensolr Changelog
Recent updates and improvements to the Opensolr platform.
Web Crawler Apr 4, 2026
- Fix Fixed date indexing errors caused by timezone offsets (e.g. +03:00) in date fields. All dates are now strictly converted to UTC before sending to Solr. Added a final safety gate: any date that doesn't match the exact Solr format is dropped rather than causing an indexing error.
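As a rough illustration of the behavior described above (not the crawler's actual code), the following Python sketch converts an ISO 8601 timestamp to UTC and drops anything that does not match the `YYYY-MM-DDThh:mm:ssZ` form. The helper name `to_solr_date`, the treatment of naive values as UTC, and the exact pattern are assumptions made for the example.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed "exact Solr format": YYYY-MM-DDThh:mm:ssZ, no fractional seconds.
SOLR_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")


def to_solr_date(raw: str) -> Optional[str]:
    """Hypothetical helper: return a UTC date string, or None to drop the value."""
    text = raw.strip()
    if text.endswith(("Z", "z")):
        # Older Python versions do not parse a trailing "Z" in fromisoformat.
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None  # unparseable input is dropped, not indexed
    if parsed.tzinfo is None:
        # Assumption: values without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    result = parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Final safety gate: anything not in the exact format is dropped.
    return result if SOLR_DATE.match(result) else None
```

Returning `None` instead of raising lets the caller omit the field and still index the rest of the document.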
Web Crawler Mar 28, 2026
- Fix Fixed a bug where the Solr buffer could get permanently stuck after stopping a crawl mid-run. Documents with oversized embedding payloads no longer block the entire batch — payloads are automatically capped, and each batch flush succeeds independently.
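The entry above does not say how payloads are capped or how batches are sent, so the following is only a minimal Python sketch of the general idea: cap each oversized payload, send the buffer in independent batches, and keep only the failed batches for retry. The field name `embedding_payload`, the limits, and the `send` callback are all hypothetical.

```python
from typing import Callable, Dict, List

MAX_PAYLOAD_CHARS = 2000  # hypothetical cap, not Opensolr's real limit
BATCH_SIZE = 2            # kept small for illustration


def cap_payload(doc: Dict) -> Dict:
    """Return a copy of the document with an oversized payload truncated."""
    capped = dict(doc)
    payload = capped.get("embedding_payload")  # hypothetical field name
    if isinstance(payload, str) and len(payload) > MAX_PAYLOAD_CHARS:
        capped["embedding_payload"] = payload[:MAX_PAYLOAD_CHARS]
    return capped


def flush_buffer(buffer: List[Dict], send: Callable[[List[Dict]], None]) -> int:
    """Flush the buffer batch by batch and return the number of documents sent.

    A failing batch does not block the others; its documents stay in the
    buffer so a later flush can retry them.
    """
    flushed = 0
    failed: List[Dict] = []
    for start in range(0, len(buffer), BATCH_SIZE):
        batch = [cap_payload(d) for d in buffer[start:start + BATCH_SIZE]]
        try:
            send(batch)
            flushed += len(batch)
        except Exception:
            failed.extend(batch)
    buffer[:] = failed  # the buffer never stays stuck on already-sent documents
    return flushed
```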
Web Crawler Mar 6, 2026
- Fix Document indexing now works reliably for all major office formats. DOCX, DOC, XLSX, XLS, and PPTX files are fully supported with proper text extraction — including tables, headers, footers, and speaker notes. Previously, many documents were indexed with empty text due to format misdetection and encoding issues.
Web Crawler Mar 4, 2026
- Fix Fixed tag field generation during crawl. The tags and title_tags fields used for autocomplete and spellcheck were being stored with raw special characters intact, which could produce noisy or broken suggestions. They are now cleaned before storage (special characters stripped, whitespace normalized), so autocomplete and spellcheck no longer return suggestions polluted by stray characters.
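The exact cleaning rules are not listed in this entry. One plausible reading, sketched in Python under the assumption that "special characters" means anything other than letters, digits, underscores, and whitespace, is shown below; `clean_tag` is a hypothetical name.

```python
import re


def clean_tag(raw: str) -> str:
    """Hypothetical cleaner for values stored in tags / title_tags."""
    # Replace every character that is not a word character or whitespace.
    # \w is Unicode-aware for str patterns, so accented letters survive;
    # it also keeps underscores.
    stripped = re.sub(r"[^\w\s]", " ", raw)
    # Collapse runs of whitespace (including tabs and newlines) to one space.
    return re.sub(r"\s+", " ", stripped).strip()
```

Replacing with a space rather than deleting keeps words such as "Quick-Start" from fusing into a single token.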
Web Crawler Mar 2, 2026
- Fix Automatic cleanup of stale crawler lock files. If a previous crawl crashed or was interrupted, leftover lock files could silently prevent the next run from starting — the crawler would launch but do nothing. Resume now detects and removes stale lock files before starting, so scheduled cron runs and manual resumes always work reliably.
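How the crawler decides that a lock is stale is not described here. A common approach, shown as a Python sketch under that assumption, is to treat a lock file as stale once its modification time is older than a threshold. The function name, the lock path, and the one-hour threshold are illustrative only.

```python
import time
from pathlib import Path

STALE_AFTER_SECONDS = 3600  # hypothetical threshold


def clear_stale_lock(lock_path: Path, max_age: float = STALE_AFTER_SECONDS) -> bool:
    """Remove the lock file if it is older than max_age; return True if removed."""
    try:
        age = time.time() - lock_path.stat().st_mtime
    except FileNotFoundError:
        return False  # no lock at all, nothing to clean up
    if age < max_age:
        return False  # lock is fresh; another crawl may still be running
    lock_path.unlink(missing_ok=True)  # requires Python 3.8+
    return True
```

Calling this before a resume means a crashed run cannot leave the next scheduled run idle; a production version might also record and check the owning process ID.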
Web Crawler Mar 1, 2026
- Fix Fixed the Flush to Solr button in Web Crawler always reporting that the buffer is empty, even when documents were actually flushed. It now correctly reports the number of documents flushed and automatically commits after flushing, so your documents become searchable immediately instead of waiting for the next auto-commit cycle.