Opensolr Changelog
Recent updates and improvements to the Opensolr platform.
Web Crawler Mar 5, 2026
- Improved Faster Playwright rendering in Chrome mode. Pages now complete in ~0.5–1s instead of 2–25s. The old approach waited for all network activity to stop (analytics, trackers, ad pixels), which stalled on busy pages. Now it waits for the DOM, gives JS 500ms to hydrate, and grabs the content.
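
The wait strategy described above can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual code: the function name `fetch_rendered_html` and the `hydrate_ms` parameter are hypothetical, and `page` is assumed to be a Playwright sync-API `Page` object (or anything exposing the same three methods).

```python
def fetch_rendered_html(page, url, hydrate_ms=500):
    """Load `url` and return its HTML without waiting for the network to go idle.

    Hypothetical helper: `page` is assumed to behave like a Playwright
    sync-API Page (goto, wait_for_timeout, content).
    """
    # Return as soon as the DOM is parsed, rather than waiting for
    # analytics, trackers, and ad pixels to finish loading.
    page.goto(url, wait_until="domcontentloaded")
    # Fixed grace period so client-side JavaScript can hydrate the page.
    page.wait_for_timeout(hydrate_ms)
    # Grab whatever has been rendered at this point.
    return page.content()
```

With Playwright installed, a page created via `sync_playwright()` and `browser.new_page()` could be passed in directly. The trade-off is that content injected later than the grace period would be missed, which is why the delay is left configurable in this sketch.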
Web Crawler Mar 4, 2026
- Improved Solr batch indexing is now more reliable during crawls. When a batch insert to Solr fails (e.g. temporary overload or timeout), the documents are kept in the local buffer and retried on the next flush cycle, instead of being silently lost.
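
The retry behavior described above might look something like the sketch below. It is an assumption-laden illustration: the class `BatchBuffer`, the `send_batch` callable, and the `batch_size` default are all hypothetical names, and the real crawler may track failures differently.

```python
class BatchBuffer:
    """Hypothetical local buffer that keeps documents when a send fails."""

    def __init__(self, send_batch, batch_size=100):
        # `send_batch` is any callable that posts a list of documents to
        # Solr and raises an exception on failure (assumed interface).
        self.send_batch = send_batch
        self.batch_size = batch_size
        self.pending = []

    def add(self, doc):
        """Queue a document and flush once the batch is full."""
        self.pending.append(doc)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Try to send everything pending; return True on success."""
        if not self.pending:
            return True
        try:
            self.send_batch(list(self.pending))
        except Exception:
            # Overload or timeout: keep the documents for the next flush
            # cycle instead of dropping them.
            return False
        # Clear the buffer only after the send has succeeded.
        self.pending.clear()
        return True
```

One consequence of this design is that a retried batch can be larger than `batch_size`, because documents added after the failure are sent together with the ones that were held back.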
Web Crawler Mar 2, 2026
- Improved Smarter Resume for the Web Crawler. Clicking Resume now always launches the crawler, even when the queue appears empty. Previously, the UI would refuse to resume if there were no pages left in the queue — but that is exactly the scenario where Resume needs to work, because the crawler re-discovers new content by re-reading your sitemaps. No more misleading "nothing to resume" messages.
Web Crawler Feb 27, 2026
- Improved Web Crawler indexing is now faster — crawled pages are sent to Solr in larger batches instead of one at a time, reducing round-trip overhead and significantly speeding up the overall indexing process.
- Improved Clicking Resume when the crawler queue is empty now shows a clear message explaining there are no more pages to process, instead of silently doing nothing. It suggests stopping the cron schedule and starting a fresh crawl.
- Improved The crawler status badge now distinguishes between Running (green), Paused (blue), and Stopped (red). When the cron schedule is active but no crawler processes are running, the dashboard shows Paused instead of Running, so you always know the actual state of your crawl.
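
The badge logic in the last entry can be expressed as a small decision function. This is only a sketch: the function and parameter names are hypothetical, and it assumes that Stopped means neither an active cron schedule nor any running crawler process, which the entry implies but does not state outright.

```python
# Badge colors as listed in the changelog entry.
BADGE_COLORS = {"Running": "green", "Paused": "blue", "Stopped": "red"}


def crawler_status(cron_active, running_processes):
    """Return the badge label for a crawl (hypothetical helper).

    cron_active: whether the cron schedule is enabled.
    running_processes: number of crawler processes currently alive.
    """
    if running_processes > 0:
        return "Running"
    if cron_active:
        # Scheduled, but nothing is executing right now.
        return "Paused"
    # Assumption: no schedule and no processes means the crawl is stopped.
    return "Stopped"
```

For example, `crawler_status(True, 0)` yields `"Paused"`, matching the case the entry calls out.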
Web Crawler Feb 25, 2026
- Improved Smarter content extraction — the web crawler now uses a dual-extraction strategy that runs two independent text extraction engines and picks whichever captures more real content. Pages with heavy JavaScript, complex layouts, or framework-rendered content (React, Next.js, Angular, Vue) are now detected and rendered automatically. The result: richer, more complete text in your Opensolr Index, especially for modern web applications.
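
The "pick whichever captures more" step could be sketched as below. The changelog does not say which extraction engines are used or how "real content" is measured, so everything here is an assumption: the extractors are passed in as plain callables, and a simple word count stands in for the real scoring.

```python
def content_score(text):
    """Crude stand-in for 'real content': the number of whitespace-separated words."""
    return len(text.split())


def pick_richer(html, extractors):
    """Run every extractor on `html` and return the highest-scoring text.

    `extractors` is a non-empty list of callables mapping HTML to plain
    text (hypothetical interface). On a tie, the earlier extractor wins.
    """
    results = [extract(html) for extract in extractors]
    return max(results, key=content_score)
```

A word count is easy to fool with navigation menus or boilerplate, so a production scorer would likely weigh the results more carefully. The sketch only illustrates the run-both-and-compare structure.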