Opensolr Changelog — Recent Updates & Improvements

Data Ingestion Apr 9, 2026

Improved Data Ingestion queue dashboard redesigned for scale — indexes and jobs now load progressively on click instead of all at once. Per-index progress dialog shows real-time status for each index separately. Handles thousands of jobs without performance degradation.
Improved Data Ingestion enrichment pipeline upgraded: embeddings now use structured text (text_t) when available for richer semantic vectors. Sentiment analysis uses the first 10 complete sentences instead of just title + description. The document size field is now computed automatically.
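
The text selection described in the enrichment entry above could look roughly like the sketch below. It is illustrative only, not Opensolr's actual implementation: the helper names, the `title` and `description` keys, and the naive punctuation-based sentence splitter are all assumptions. Only `text_t` and the limit of 10 sentences come from the changelog.

```python
import re

MAX_SENTENCES = 10  # limit stated in the changelog entry

# A "complete" sentence here means a run of text that ends in terminal
# punctuation (Latin or CJK). This is a naive assumption: it will also
# split on decimals and abbreviations.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]+")


def embedding_text(doc: dict) -> str:
    """Prefer structured text (text_t); fall back to title + description."""
    structured = (doc.get("text_t") or "").strip()
    if structured:
        return structured
    # Hypothetical fallback field names.
    parts = [doc.get("title"), doc.get("description")]
    return " ".join(p.strip() for p in parts if p and p.strip())


def first_complete_sentences(text: str, limit: int = MAX_SENTENCES) -> str:
    """Return up to `limit` complete sentences, dropping any trailing fragment."""
    sentences = [m.group().strip() for m in _SENTENCE.finditer(text)]
    return " ".join(sentences[:limit])
```

In this sketch a trailing fragment with no terminal punctuation is dropped, so the sentiment input never ends mid-sentence.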

Improved Full CJK and UTF-8 safety across the entire enrichment pipeline. Japanese, Chinese, emoji, and accented characters no longer crash batch processing. All text is sanitized before embedding, Solr push, and JSON encoding. One malformed byte in a PDF can never kill an entire batch again.
Improved Ingestion API now skips individual duplicate documents instead of rejecting the entire batch. One duplicate URI no longer blocks 49 other documents from being indexed. The Ingestion Queue also auto-refreshes every 20 seconds when jobs are pending.
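
As a rough illustration of the two entries above, the sketch below sanitizes text before it reaches embedding or JSON encoding, and skips duplicate URIs per document instead of rejecting the batch. The function names and the `uri` key are hypothetical; the real pipeline may work differently.

```python
import unicodedata


def sanitize_text(raw) -> str:
    """Return valid, NFC-normalized text with control characters removed."""
    if isinstance(raw, bytes):
        # Malformed bytes become U+FFFD instead of raising an exception.
        text = raw.decode("utf-8", errors="replace")
    else:
        # Lone surrogates cannot be encoded as UTF-8; they become "?".
        text = raw.encode("utf-8", errors="replace").decode("utf-8")
    text = unicodedata.normalize("NFC", text)
    # Keep newlines and tabs, drop every other control character (Cc).
    return "".join(
        ch for ch in text if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )


def split_duplicates(batch, already_indexed):
    """Separate new documents from duplicates, keyed by a 'uri' field."""
    seen = set(already_indexed)
    accepted, skipped = [], []
    for doc in batch:
        uri = doc["uri"]
        if uri in seen:
            skipped.append(uri)  # skip only this document
        else:
            seen.add(uri)
            accepted.append(doc)
    return accepted, skipped
```

With a batch of 50 documents where one URI is already indexed, `split_duplicates` returns 49 accepted documents and one skipped URI, which mirrors the behavior the entry describes.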

Improved Cleaner error display in the ingestion queue table. Error columns now show a short summary like "30 ok, 20 doc(s) failed — click for details" instead of the full raw error. Click the message to open the Job Detail modal with the complete breakdown per document.
Improved Detailed Solr error reporting in the Data Ingestion Queue. When a document fails at the Solr level — unknown field, type mismatch, schema violation — the exact error from Solr is captured and shown in the Job Detail modal. No more guessing why a document was rejected.
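
One way to picture the split between the short table summary and the per-document detail view is the sketch below. The `DocResult` type and both helpers are hypothetical, and the sample error string is invented for illustration rather than copied from real Solr output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocResult:
    uri: str
    error: Optional[str] = None  # raw error text from Solr; None when indexed


def summarize(results) -> str:
    """Short text for the queue table's error column."""
    failed = sum(1 for r in results if r.error)
    ok = len(results) - failed
    if failed == 0:
        return f"{ok} ok"
    return f"{ok} ok, {failed} doc(s) failed"


def failure_details(results) -> dict:
    """Per-document breakdown for the detail view: URI mapped to raw error."""
    return {r.uri: r.error for r in results if r.error}


# Example: 30 indexed documents and 20 rejected ones.
results = [DocResult(f"doc-{i}") for i in range(30)]
results += [
    DocResult(f"bad-{i}", "unknown field 'price_x'")  # illustrative message
    for i in range(20)
]
summary = summarize(results)
```

Keeping the raw error per document and deriving the summary from it means the table stays compact while nothing is lost for the detail view.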