Solr Migration
Migrating from older Solr to Solr 8.x or 9.x
Server retirement notice in your inbox? Stuck on Solr 3.x, 4.x, 5.x, 6.x or 7.x? Here is the complete, in-depth guide to moving an Opensolr index to a modern Solr 8 or 9 environment — including the path most customers want: we migrate the configuration for you in under 4 business hours, for a flat, one-time 85 EUR per index.
Read this before anything else
Opensolr migrates configurations only — we do NOT migrate data. Document data cannot be carried across a major Solr / Lucene version. There is no in-place upgrade path that preserves an existing Lucene index when crossing Solr 5 → 8, 6 → 9, 7 → 9, etc.
You will perform a FULL REINDEX into the new core, from your source-of-truth (database, CMS, file store, crawler, whatever feeds your index in the first place). The old core and the new core run in parallel while you do this — nothing is taken away from you until the old environment is retired on the date stated in your retirement email.
If you do not have a source-of-truth (you only have the old Solr index), there is a fallback dump-and-reload script in section 11, but read the caveats first. The right answer for almost everyone is to point your existing indexer at the new core.
Option 1 · We migrate the config
Flat 85 EUR per index, one-time. Turnaround: under 4 business hours. We provision a new Solr 8 / 9 index, convert your schema.xml + solrconfig.xml, validate it loads, and hand you the new connection details. You then run your reindex into it. Old and new indexes run in parallel until the old environment is retired.
Option 2 · Brand-new index, default config
Create a new index in any region that runs Solr 8 or 9 and accept the default modern schema that ships with it. Point your crawler / Drupal module / WordPress plugin / Search API connector at the new index, repopulate, then delete the old one once you are happy. Zero schema-conversion work because the new index ships with a current config.
Option 3 · Convert config yourself
You convert schema.xml and solrconfig.xml using the conversion tables below, upload the new config zip via the Configuration Files Editor, then reindex. Use this if you have heavily customised configs and want to preserve every tweak yourself.
1. Why migrate now
Solr versions older than 8.x are end-of-life upstream — no security patches, no bug fixes, and the Lucene index format is several majors behind. In practice that means:
- No vector / KNN search. Solr 9 added `DenseVectorField` and `{!knn}`. Hybrid Search, AI Hints and AI Reader on Opensolr only work on Solr 9.
- Better relevance. BM25 tuning, edismax improvements, the unified highlighter, Learning-to-Rank — all matured significantly between 5.x and 9.x.
- Faster, smaller indexes. PointFields use BKD trees instead of TrieField term tokens. Range queries on numerics and dates run an order of magnitude faster, on-disk size shrinks.
- Security. CVE coverage stops at the versions Apache currently supports. Old Solr also ships with default request handlers (e.g. `/replication?command=...`) that have known abuse vectors.
- Server retirements. Underlying hardware running old Solr versions is being retired. If your index lands on one of these servers, you receive an email from us with the cutoff date and the affected core name(s).
2. Configurations move. Data does not.
A Solr backup is a snapshot of the Lucene segment files (.cfs, .fdt, .tim, .si, segments_N). Lucene reads segments only from the previous major version — Solr 9 reads Lucene 9 and Lucene 8 segments and nothing older. Try to open a Lucene 5 / 6 / 7 segment in Solr 9 and the IndexReader throws IndexFormatTooOldException at startup.
There is therefore no “upgrade in place” path. A migration always means:
1. Provision a brand-new core on a Solr 8 / 9 cluster.
2. Convert the configuration files (`schema.xml`, `solrconfig.xml`, plus any stopwords / synonyms / protected-words text files) so they load on the target Solr.
3. Reindex every document from your source-of-truth — the place that originally fed the old index. Database, CMS, file store, crawler, ingestion API, whatever it is.
Step 1 and Step 2 are what we do for you under the 85 EUR Option 1. Step 3 is yours, every time, regardless of which option you pick. The cost-and-time of step 3 depends entirely on how big and how reachable your source-of-truth is.
One exception — if your only source of data is the old Solr index itself (you no longer have the database / CMS / file store that fed it), section 11 has a fallback cursorMark dump-and-reload script. It works most of the time but it is not a real reindex — it cannot regenerate fields whose values are computed at index time. Read the caveats first.
3. Parallel running & cutover plan
There is no rip-and-replace day. The old index and the new index run side by side for as long as you need to be confident in the new one — up to the retirement date communicated in your email.
Timeline shape
- Day 0: we provision the new core (different name, different cluster) and load the converted config. Old core is untouched, still serving production traffic.
- Day 0–N: you reindex into the new core from your source-of-truth. The old core stays writable so production can continue indexing into it during this period if you want.
- Day N: you validate the new core (doc count parity, sample query parity — see section 13), then point your application at the new connection details.
- Day N+x: after a quiet observation window, you delete the old core from the Opensolr control panel. The old server slot is freed and can be retired.
- Hard deadline: the retirement date in the email. Old core stops accepting traffic on that date regardless of where you are in the process — plan accordingly.
During the parallel period your application can either keep talking to the old core only (simplest) or dual-write to both (zero data loss across the cutover). Dual-write only matters if you are still adding documents during the migration window. For static or slowly-changing indexes a single read-write source — the old one — is fine until the moment of cutover.
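If you do decide to dual-write during the window, the pattern is simple: send each batch to every target and never let a failure on the new core block the old, still-authoritative one. A minimal sketch, assuming your real HTTP client is wrapped in a `write` callable (a placeholder name, not an Opensolr API):

```python
# Hedged sketch of dual-writing during the parallel window: the same
# batch goes to every target; a failure on the NEW core is reported,
# not raised, so the OLD (authoritative) core keeps receiving writes.
def dual_write(docs, writers):
    """Send one batch of docs to every (name, write) target.
    Returns a list of (name, exception) for targets that failed."""
    failures = []
    for name, write in writers:
        try:
            write(docs)
        except Exception as exc:
            failures.append((name, exc))
    return failures
```

Wire each `write` to whatever posts to `/update` in your stack, and treat a failure on the new core as a signal to replay that batch later, not as a production error.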
4. Pre-migration checklist
Whether you migrate yourself or hand it off to us, gather this first:
- Source core name. e.g. `my_core`.
- Current Solr version. Visit `https://YOUR-CLUSTER/solr/CORE/admin/system?wt=json` — the `lucene.solr-spec-version` field tells you exactly. Or check the Index Control Panel → Information tab.
- Doc count. `/select?q=*:*&rows=0&wt=json` → `response.numFound`. Drives how long YOUR reindex will take.
- Source-of-truth confirmation. Where does the data come from? Database table, CMS API, crawler, ingestion API, file directory? Confirm it is reachable and complete BEFORE migration starts.
- Custom field types in `schema.xml`. Anything beyond stock types — custom `solr.TextField` chains, ICU collations, custom tokenizers, BBoxField, ExternalFileField — needs an explicit conversion decision.
- Custom request handlers in `solrconfig.xml`. Custom `/select` defaults, `/suggest` setups, custom spellcheck dictionaries, custom `requestParser` tweaks.
- Stopwords / synonyms / protected words. Every additional `.txt` file referenced from the schema.
- Application config locations. Where is the connection string? `settings.php`, `wp-config.php`, environment variable, Helm value? Cutover speed depends on this being trivial to flip.
Concrete commands for inventory and snapshot:
```shell
# Inventory the indexes you have on the affected cluster
curl -u opensolr:YOUR_API_KEY \
  "https://opensolr.com/solr_manager/api_get_user_cores?email=YOU@EXAMPLE.COM"

# Download every config file for the affected core (zip)
curl -u opensolr:YOUR_API_KEY \
  -o conf-CORE_NAME.zip \
  "https://opensolr.com/solr_manager/api_download_config_files/CORE_NAME?email=YOU@EXAMPLE.COM"

# Take a backup snapshot of the OLD core (archive only - cannot be restored across major versions)
curl -u opensolr:YOUR_API_KEY \
  "https://opensolr.com/solr_manager/api_create_backup/CORE_NAME?email=YOU@EXAMPLE.COM"
```
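If you are scripting the inventory across many cores, the version check can be done programmatically. A sketch, assuming you have already fetched the `/admin/system?wt=json` response body (the JSON path matches the standard Solr system handler):

```python
# Extract the Solr major version from the admin/system handler's JSON
# output, so a script can flag which cores still need migrating.
import json

def solr_major(system_json: str) -> int:
    """Major Solr version from /admin/system?wt=json response text."""
    spec = json.loads(system_json)["lucene"]["solr-spec-version"]
    return int(spec.split(".")[0])

sample = '{"lucene": {"solr-spec-version": "5.5.5"}}'  # trimmed response
solr_major(sample)  # → 5, i.e. below 8, so this core needs migrating
```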
5. Option 1 · We migrate the config (85 EUR/index, < 4 business hours)
For most customers this is the right call. You email support@opensolr.com with the index name and target Solr major (8 or 9). We send a payment link for 85 EUR per index, one-time. Within under 4 business hours of payment we deliver:
- A new Opensolr index provisioned on a Solr 8 / 9 cluster (any region you choose). Different name from the old one so the two can coexist.
- Your `schema.xml` converted: every TrieField → PointField with `docValues` as appropriate, every removed-in-modern-Solr element rewritten or replaced (full table in section 8), schema version bumped, `_version_` field added if missing.
- Your `solrconfig.xml` converted: `luceneMatchVersion` bumped, master/slave → leader/follower terminology, removed handlers stripped, cache classes updated to Caffeine (full table in section 9).
- Stopwords / synonyms / protected-words files carried forward unchanged.
- Connection details delivered to you: hostname, port, path, HTTP basic auth username and password.
- The old core stays untouched and continues to serve production traffic until you flip the application over.
What happens next is on you: you reindex into the new core from your source-of-truth (section 11), validate it (section 13), then point your application at it. Both indexes coexist for as long as you need — up to the retirement date.
The 4-business-hours target covers any standard config — including custom analyzer chains, language-specific tokenizers, custom request handlers, and edismax tuning. Genuinely exotic cases (custom Java plugins / JARs, non-standard distributed-search topologies) we quote separately, but those are rare.
6. Option 2 · Brand-new index with default config
If your old schema is essentially the stock Opensolr / Drupal / WordPress / Search API Solr config, you do not need a custom conversion. Create a new index and accept the modern default schema. Zero schema-conversion work.
1. Create a new Opensolr index in a region that runs Solr 8 or 9 (visible in the version dropdown when you click Create New Index). Pick a non-colliding name — e.g. add a `_v2` suffix; if you want vector / hybrid search, a `__dense` suffix.
2. Update your client to point at the new core:
   - Drupal Search API Solr: Configuration → Search and metadata → Search API → your server → edit → change Solr core / path. Then `drush search-api-solr:get-server-config` & upload to the new index.
   - Drupal Opensolr Search module: Settings tab → Save & Connect against the new index. Auto-uploads its own config set.
   - WordPress Opensolr Search plugin: Settings tab → Save & Connect against the new index name.
   - Web Crawler: create a fresh crawl job pointing at the new core; it populates from the start URL.
   - Direct API / SolrJ / pysolr / SolrClient: change the connection string in your application config.
3. Reindex from source-of-truth. Trigger your normal indexing pipeline against the new core. Verify `numFound` matches expectations; run sample queries.
4. Cut over. When satisfied, delete the old index from the control panel.
Why this is the simplest path: the new index ships with a fully Solr-9-compatible schema and solrconfig out of the box — PointFields, the dense-vector field if you pick __dense, leader/follower terminology, unified highlighter pre-configured.
7. Option 3 · Convert config yourself
If you have heavily-customised schema.xml / solrconfig.xml and want to preserve every tweak yourself, here is the procedure. Every step below is mechanical — the conversion tables in sections 8–9 do the heavy lifting.
- Download the current config zip via the API endpoint shown in section 4.
- Convert `schema.xml` using the table in section 8. Bump `<schema>` version to 1.7. Add the `version` attribute if missing.
- Convert `solrconfig.xml` using the table in section 9. Bump `<luceneMatchVersion>` to your target Solr version.
- Create a new Opensolr index and upload the converted zip via the Configuration Files Editor — it auto-validates. If anything is wrong, the upload is rejected and the live index is unchanged.
- Reindex from source-of-truth (section 11).
- Validate & cut over (section 13).
8. schema.xml — full conversion table
Every cell below is something that will break if you carry it forward unchanged. PointFields are not a drop-in for TrieFields — the analyzer disappears, the term encoding is different, and docValues="true" is required if you want to facet, sort, or group.
Side-by-side example of the most common conversions:
```xml
<!-- BEFORE: Solr 5.x / 6.x / 7.x schema.xml -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>

<!-- AFTER: Solr 8.x / 9.x schema.xml -->
<fieldType name="pint" class="solr.IntPointField" docValues="true"/>
<fieldType name="plong" class="solr.LongPointField" docValues="true"/>
<fieldType name="pfloat" class="solr.FloatPointField" docValues="true"/>
<fieldType name="pdouble" class="solr.DoublePointField" docValues="true"/>
<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.025" maxDistErr="0.001" distanceUnits="kilometers"/>
```
9. solrconfig.xml — full conversion table
Worked example of the replication handler conversion:
```xml
<!-- BEFORE: legacy solrconfig.xml -->
<luceneMatchVersion>5.1.0</luceneMatchVersion>

<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000"/>
<requestDispatcher handleSelect="true"> ... </requestDispatcher>

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://leader-host.example.com:8983/solr/CORE_NAME</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>

<!-- AFTER: Solr 9.x solrconfig.xml -->
<luceneMatchVersion>9.4.0</luceneMatchVersion>

<requestParsers multipartUploadLimitInKB="2048000"/> <!-- enableRemoteStreaming removed -->
<requestDispatcher> ... </requestDispatcher> <!-- handleSelect removed -->

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="leader">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
  <lst name="follower">
    <str name="leaderUrl">http://leader-host.example.com:8983/solr/CORE_NAME</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```
10. Dates, spatial, docValues — the three sharpest edges
Date format must be UTC with literal Z suffix
Solr 9 DatePointField rejects timezone offsets like +02:00 at insert time. Older Solr was lenient and silently converted them. The Opensolr crawler and Data Ingestion API have already been updated to convert dates correctly. If you have your own indexer, update it:
```php
// Solr 9 pdate REQUIRES UTC with the literal Z suffix.
// Timezone offsets like +02:00 are REJECTED at insert time.

// WRONG (Solr 9 will throw):
$doc["upload_date_dtm"] = "2026-04-02T15:11:07+02:00";

// RIGHT - convert in PHP before pushing:
$dt = new DateTime($input);
$dt->setTimezone(new DateTimeZone("UTC"));
$doc["upload_date_dtm"] = $dt->format("Y-m-d\TH:i:s\Z");
```
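The same conversion for Python indexers (pysolr, custom scripts) — a sketch that assumes the incoming value is an ISO-8601 string; naive timestamps without an offset are treated as already-UTC here, which you should adjust if your source uses local time:

```python
# Normalize timestamps to the UTC "Z" form Solr 9 pdate fields require.
from datetime import datetime, timezone

def to_solr_date(value: str) -> str:
    """ISO-8601 in (with or without offset), Solr-9-safe UTC-Z string out."""
    dt = datetime.fromisoformat(value)     # accepts "+02:00"-style offsets
    if dt.tzinfo is None:                  # naive input: assumed to be UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

to_solr_date("2026-04-02T15:11:07+02:00")  # → "2026-04-02T13:11:07Z"
```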
DocValues required for facet / sort / group on PointFields
TrieFields stored facet/sort information in their indexed terms. PointFields do not — they store BKD points only. To facet, sort, group, or use function queries on a numeric or date field, you must add docValues="true" on the fieldType (or on every <field> entry). Without it, every facet on that field returns zero buckets and Solr logs “can not sort on a field which is not indexed nor has doc values”.
Spatial: LatLonType is gone
If you used solr.LatLonType with a subFieldSuffix, replace it with solr.LatLonPointSpatialField. Stored values are still "lat,lon" string format on input. {!geofilt} and geodist() still work. Hidden _coordinate subfields are gone — rewrite any query that referenced them.
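A radius query against the replacement type looks like this in client code — a sketch only, where the field name `location_rpt` follows the schema example in section 8 and should be swapped for your own field:

```python
# Build Solr query params for a {!geofilt} radius filter plus a
# geodist() distance sort against a LatLonPointSpatialField /
# SpatialRecursivePrefixTreeFieldType field.
def geofilt_params(field: str, lat: float, lon: float, km: float) -> dict:
    """Return the q / fq / sort params for a radius search."""
    return {
        "q": "*:*",
        "fq": f"{{!geofilt sfield={field} pt={lat},{lon} d={km}}}",
        "sort": f"geodist({field},{lat},{lon}) asc",
    }

geofilt_params("location_rpt", 45.15, -93.85, 5.0)["fq"]
# → "{!geofilt sfield=location_rpt pt=45.15,-93.85 d=5.0}"
```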
11. How YOU reindex into the new core
This is the part of the migration we cannot do for you. Reindexing means feeding documents into the new core from the same source that originally fed the old one. Concretely:
- If you use the Opensolr Web Crawler: create a fresh crawl job on the new core. Same start URL, same mode, same threads. The crawler will populate it from scratch — it does not need the old index at all.
- If you use the Drupal Opensolr Search module: in the new index settings, click Save & Connect to bind the module to the new core, then run Ingest All Now from the Data Ingestion tab. Real-time sync hooks will keep it current after that.
- If you use the WordPress Opensolr Search plugin: same pattern — Save & Connect against the new index, then bulk-ingest from the Data Ingestion tab.
- If you use Drupal Search API Solr: upload the new server config zip, point the Search API server at the new index, then `drush search-api:reset-tracker` + `drush search-api:index`.
- If you have a custom indexer (cron job, message queue consumer, ETL script): change the connection string and run it against the new core. The shape of the docs you push is unchanged.
- If you push via the Data Ingestion API: change the target index name in your client; the API endpoint URL stays the same.
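Whatever the source, a custom reindex loop boils down to: iterate rows, batch them, push each batch. A minimal, client-agnostic sketch — `push` is a placeholder for whatever POSTs a JSON batch to the new core's `/update` endpoint:

```python
# Batch an arbitrary iterable of docs and push each batch to the new core.
from itertools import islice

def batches(rows, size=1000):
    """Yield lists of at most `size` docs from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def reindex(rows, push, size=1000):
    """Send every batch via push(); return the total docs sent."""
    total = 0
    for chunk in batches(rows, size):
        push(chunk)  # e.g. POST chunk as JSON to NEW_CORE/update
        total += len(chunk)
    return total
```

Keeping the batch size around 1000–2000 docs mirrors the `BATCH=2000` used by the fallback dump script below and avoids oversized update requests.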
Fallback: dump from old core, replay into new core
Only consider this if your source-of-truth is genuinely gone (or it would take longer to rebuild than the retirement deadline allows). The script below paginates the old core via cursorMark, strips two internal fields that must not be re-imported (_version_, _root_), and POSTs each batch into the new core. Requires jq.
Caveats — read before using this script:
- Only fields with `stored="true"` on the OLD schema can be dumped. Index-only fields are gone.
- Fields the new schema does not declare are silently dropped — or rejected, depending on schemaless settings.
- Computed-at-index-time fields (copyField targets, language detection, sentiment, embeddings, custom analyzers) cannot be reproduced this way — they will be missing or stale.
- Block-join parent/child relationships are lost (the `_root_` field is stripped).
- Vector embeddings from any old `embeddings` field will not be regenerated — if you want hybrid search you still need to re-embed from text.
A real reindex from source-of-truth fixes all of this. The dump script does not.
```shell
# OPTIONAL FALLBACK ONLY - if you have NO source-of-truth.
# Read the warnings in section 11 first. Recommended path is to
# re-ingest from your CMS / DB / file store.
OLD="https://old-cluster.example.com/solr/OLD_CORE"
NEW="https://new-cluster.example.com/solr/NEW_CORE"
AUTH_OLD="opensolr:OLD_INDEX_API_KEY"
AUTH_NEW="opensolr:NEW_INDEX_API_KEY"
BATCH=2000
CURSOR="*"

while : ; do
  RESP=$(curl -s -u "$AUTH_OLD" \
    --data-urlencode "q=*:*" \
    --data-urlencode "rows=$BATCH" \
    --data-urlencode "sort=id asc" \
    --data-urlencode "fl=*" \
    --data-urlencode "wt=json" \
    --data-urlencode "cursorMark=$CURSOR" \
    "$OLD/select")

  DOCS=$(echo "$RESP" | jq -c '.response.docs | map(del(._version_, ._root_))')
  [ "$DOCS" = "[]" ] && break

  curl -s -u "$AUTH_NEW" -H "Content-Type: application/json" \
    -X POST -d "$DOCS" \
    "$NEW/update?commitWithin=10000"

  NEXT=$(echo "$RESP" | jq -r .nextCursorMark)
  [ "$NEXT" = "$CURSOR" ] && break
  CURSOR="$NEXT"
done

# Final hard commit on the new core
curl -s -u "$AUTH_NEW" "$NEW/update?commit=true"
```
Why strip _version_: Solr uses it for optimistic concurrency control. Carrying old values forward causes spurious 409 conflicts on the new core. Let Solr assign fresh ones.
Why strip _root_: Block-join parent linkage that is meaningless without simultaneously reindexing the children in the same payload.
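If you replay through a script instead of the shell loop, the equivalent of the `jq` `del(._version_, ._root_)` step is a one-line field filter — a sketch:

```python
# Drop the Solr-internal fields that must not be re-imported:
# _version_ (optimistic concurrency) and _root_ (block-join linkage).
def strip_internal(doc: dict) -> dict:
    """Return a copy of the doc without Solr-internal fields."""
    return {k: v for k, v in doc.items() if k not in ("_version_", "_root_")}

strip_internal({"id": "1", "title": "x", "_version_": 1700000000, "_root_": "1"})
# → {"id": "1", "title": "x"}
```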
For very large indexes (50M+ docs): split the cursor loop into N parallel workers by partitioning on the first character of id. Each worker runs the same script with an extra fq=id:0* filter. Throughput scales close to linearly until the destination Solr write side saturates.
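Generating the per-worker filters is trivial to script. A sketch that assumes ids start with hexadecimal characters — swap in whatever alphabet your real id space uses:

```python
# One fq=id:<char>* filter per parallel dump worker, as described above.
def partition_filters(alphabet="0123456789abcdef"):
    """Return the id-prefix fq filters for N parallel cursor workers."""
    return [f"id:{c}*" for c in alphabet]

filters = partition_filters()
# 16 workers; worker 0 adds --data-urlencode "fq=id:0*" to its dump loop
```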
12. Application-side changes
- SolrJ: bump to `9.x` or `8.11.x` matching the server. Mixed major versions over HTTP work for basic ops but fail on streaming expressions and SolrCloud collection ops.
- Drupal Search API Solr: regenerate the config zip via `drush search-api-solr:get-server-config <server>`, upload via the Configuration Files Editor on the new index. Search API knows about leader/follower terminology and emits the right configset for the target Solr major.
- Drupal Opensolr Search module: point at the new index name in Settings → Save & Connect. The module owns its config set and re-uploads the right one for the destination Solr version.
- WordPress Opensolr Search plugin: same — new index name in Settings, Save & Connect handles the rest.
- Custom `/select` queries with `qt=`: replace with calls to `/HANDLER` directly — `handleSelect` dispatch is going away.
- Custom queries with `defaultOperator`: pass `q.op=AND` per request, or set it in handler defaults.
- Custom queries that reference `defaultSearchField`: pass `df=fieldname` per request.
- Range queries on numerics: PointField range syntax (`field:[1 TO 10]`) is identical at query time. No client change needed.
13. Validation & cutover
- Doc count parity. Hit `/select?q=*:*&rows=0` on both cores. `numFound` should match exactly. Differences are usually new docs added to the old core during the parallel period — rerun the reindex if you need parity.
- Spot-check sample documents. Pull 20–50 random IDs and compare every field side by side. `/select?q=id:DOC_ID&wt=json&fl=*` on both, diff the JSON.
- Top-N relevance check. Pick 10–20 of your most common production queries (from analytics, or just the ones you care about). Hit both cores, compare top 10 results. Order may differ slightly because of BM25 tweaks between versions — that is normal.
- Faceting smoke test. Run any production query that uses faceting against the new core. If a numeric or date facet returns zero buckets where it used to return many, you forgot `docValues="true"` on a PointField — fix in schema, reload core, reindex.
- Cut over. Point your application at the new core. Keep the old core running and read-only for a quiet observation window (24–48h is plenty for most apps), then delete from the control panel. Old server slot is freed.
- Hard deadline. The retirement date communicated in your email is the absolute cutoff. Old core stops serving on that date regardless — plan accordingly.
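The two mechanical checks above are easy to script once you have the JSON responses in hand. A client-agnostic sketch operating on already-fetched response dicts (field names in the example are illustrative):

```python
# Validation helpers: doc-count parity and per-field diff of sampled docs.
def count_parity(old_resp: dict, new_resp: dict) -> bool:
    """Compare numFound from /select?q=*:*&rows=0 on both cores."""
    return (old_resp["response"]["numFound"]
            == new_resp["response"]["numFound"])

def diff_doc(old_doc: dict, new_doc: dict) -> list:
    """Field names whose values differ between the two cores' copies."""
    keys = set(old_doc) | set(new_doc)
    return sorted(k for k in keys if old_doc.get(k) != new_doc.get(k))
```

Run `diff_doc` over your 20–50 sampled IDs; an empty list per document means field-level parity.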
14. Quick Q&A
Will I lose data?
No. The old core stays untouched throughout. We only delete it after you confirm the new one is good. Both run in parallel until the retirement date.
How long does the config migration take?
On our side: under 4 business hours from receipt of payment, regardless of index size. The reindex on YOUR side depends on your data volume and source-of-truth speed — minutes for small sites, hours-to-days for large catalogs.
Is there downtime?
No. The old core stays live until you flip your application config. Cutover is a single config change on your side, instantaneous.
Why can't you migrate the data too?
Lucene index segments are not forward-compatible across major versions. There is no way to read Solr 5/6/7 segments with Solr 9. Apache Solr itself does not offer this. Reindex from your source-of-truth is the universally correct answer.
Can I keep the same index name?
Yes — once the new core is validated and you delete the old one, we can rename the new one to the original name. Application config does not need to change at that point.
Will my queries break?
Common edismax / dismax / lucene syntax is identical. Things to verify: defaultSearchField / defaultOperator assumptions (now per-request), faceting on numerics (need docValues), highlighting params with hl.q wrappers.
What if I have replicas?
We rebuild the leader/follower setup on the new environment and replicate the converted config to every follower automatically.
What about my custom plugins / JARs?
Custom Java plugins must be recompiled against the target Lucene/Solr version. Send the source / JAR — we quote separately if recompilation is needed. This is rare and outside the 85 EUR flat fee.
Can I get vector / hybrid / AI search after migrating?
Yes. Ask us to give the new index a __dense suffix — that opts into the dense-vector schema and unlocks Hybrid Search, AI Hints, and AI Reader on tailored plans. See Hybrid Search.
Where do my old backups go?
Existing backups stay attached to the old core until the old environment is retired. We can copy specific snapshots elsewhere on request, but they are not restorable cross-major-version — only useful as an archive.
What if I miss the retirement deadline?
Reach out to support@opensolr.com as early as possible. We will work with you on a reasonable extension. The cutoff date in the email is the planned retirement, not a hostile deadline — nobody loses access without warning.
What if I don't have a source-of-truth anymore?
Use the cursorMark dump-and-reload script in section 11 as a fallback. Read the caveats first — computed fields, embeddings, and parent/child links cannot be reproduced this way. The dump is a copy of the stored fields, not a true reindex.
Ready to migrate?
Reply to your retirement email, or send us the index name and target Solr major. Config migration in under 4 business hours, flat 85 EUR per index, one-time. You then reindex from your source — both indexes run in parallel until the old environment retires.
Email support@opensolr.com →