Hybrid Search

Opensolr Hybrid Search — find answers to your questions

Testing Opensolr AI Search — Vector Search, AI Hints & Doc...

Step-by-Step Guide
Testing Your Opensolr AI Search Engine
Four powerful features ship with every Opensolr Web Crawler index — intent-based Vector Search, instant AI Hints, one-click Document Reader, and hands-on Query Elevation.
Crawl → Index → Embed → Solr → Search
Your complete AI search pipeline — fully managed, out of the box
Intent-Based Vector Search
Instead of matching exact keywords, vector search understands what you mean. A query like "winter hat" finds wool beanies, fleece earflap caps, and knit headwear — even when those exact words aren't on the page. Opensolr uses BGE-m3 embeddings (1024 dimensions) combined with traditional BM25 scoring for the best of both worlds: semantic understanding plus keyword precision.
Example: the query "winter hat" is embedded with BGE-m3 (1024-dimensional vector embeddings) and matches Wool Winter Cap (98%), Knit Beanie Set (94%), and Fleece Earflap Hat (89%).
Hybrid Scoring (BM25 + Vectors) · BGE-m3 1024-dim · Multilingual
AI Hints — Instant Answers from Your Content
Before your users even scroll through results, AI Hints delivers a concise, AI-generated answer right at the top of the page. It uses RAG (Retrieval-Augmented Generation) — the AI retrieves the most relevant passages from YOUR indexed content, then generates a focused answer. No hallucinations, no external data — every hint is grounded in your actual pages.
Example: for the query "best pellet heater for garage?", RAG retrieves from YOUR indexed content and the AI Hint reads:
  • Look for 40,000+ BTU models with thermostat
  • Ventilation required for enclosed spaces
  • See top-rated pellet heaters in results below
RAG-Powered · Grounded in Your Data · Zero Hallucinations
Document Reader — Summarize Any Search Result
Every search result includes a "Read" button. Click it, and the AI reads the entire web page, extracts the key information, and generates a clean summary — in seconds. You can then download the summary as a PDF. No need to visit the page, skim through ads, or parse dense content yourself.
Example: the result "Best Pellet Heaters 2026: Expert Reviews" (heatersguide.com/pellet-heaters-2026, "Complete guide to choosing the right pellet heater...") shows a Read button. The AI Reader returns a Page Summary ("Top 5 pellet heaters ranked by efficiency, noise level, and value. Castle 12327 rated best overall at $1,299...") with a Download PDF option.
One-Click Summaries · PDF Export · Key Feature Extraction
Query Elevation — Pin & Exclude Search Results
Take full control of what your users see. Query Elevation lets you pin important results to the top or exclude irrelevant ones — directly from the Search UI, with zero code and no reindexing required. Perfect for promoting landing pages, burying outdated content, or curating high-value queries.
Example: in Search Results, "Product Landing Page" (yoursite.com/products/best-seller) is pinned at #1 and forced to the top for this query. Drag to reorder when multiple results are pinned. An excluded result is hidden from this query.
  • Pin — Force a specific result to the top for a given search query
  • Exclude — Hide a result completely so it never appears for that query
  • Exclude All — Apply the rule globally, across every search query
  • Drag & drop — Reorder pinned results to control exactly which one shows first
Zero Code Required · Exclude Irrelevant Results · Pin & Reorder

Try It Live

Test the demo search engines with real vector search. Conceptual, intent-based queries like these show how vector similarity goes beyond keyword matching:

  • climate disasters hurricanes floods wildfires
  • space exploration mars colonization economy
  • ancient microbes life beyond earth

Every demo page includes built-in dev tools — query parameter inspector, full Solr debugQuery output, crawl statistics, and search analytics.


Using the Solr API Directly

Direct API access for advanced users — learn more about hybrid search.

Example Solr endpoints (credentials: 123 / 123):

https://de9.solrcluster.com/solr/vector/select?wt=json&indent=true&q=*:*&rows=2
https://fi.solrcluster.com/solr/rueb/select?wt=json&indent=true&q=*:*&rows=2
https://chicago96.solrcluster.com/solr/peilishop/select?wt=json&indent=true&q=*:*&rows=2

Simple Lexical Query

curl -u 123:123 "https://de9.solrcluster.com/solr/vector/select?q=climate+change&rows=5&wt=json"

Pure Vector Query (KNN)

curl -u 123:123 "https://de9.solrcluster.com/solr/vector/select" --data-urlencode 'q={!knn f=embeddings topK=50}[0.123,0.432,0.556,...]' --data-urlencode 'wt=json'

Replace the vector array with your own embedding from the Opensolr AI NLP API.
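
As a rough illustration of that substitution, the sketch below formats a list of floats into the {!knn} query string used above. The helper name knn_query is hypothetical, and the three-value vector only keeps the example short; a real BGE-m3 embedding would carry 1024 values.

```python
def knn_query(embedding, field="embeddings", top_k=50):
    """Format a list of floats as a Solr {!knn} query string."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    # repr() keeps full float precision; Solr expects a bracketed, comma-separated list.
    vector = ",".join(repr(float(x)) for x in embedding)
    return "{!knn f=%s topK=%d}[%s]" % (field, top_k, vector)


# Shortened vector for readability (hypothetical values).
print(knn_query([0.123, 0.432, 0.556], top_k=10))
```

The returned string can be passed as the q parameter, or as vectorQuery in a hybrid request.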

Hybrid Query (Lexical + Vector)

curl -u 123:123 "https://de9.solrcluster.com/solr/vector/select" --data-urlencode 'q={!bool should=$lexicalQuery should=$vectorQuery}' --data-urlencode 'lexicalQuery={!edismax qf=content}climate change' --data-urlencode 'vectorQuery={!knn f=embeddings topK=50}[0.12,0.43,0.66,...]' --data-urlencode 'wt=json'

Combines traditional keyword scoring with semantic vector similarity — best of both worlds.


Getting Embeddings via Opensolr API

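Generate vector embeddings for any text using this endpoint:

// Request an embedding for $payload from the Opensolr embed endpoint.
// Returns the decoded JSON response as an array, or null if the request fails.
function postEmbeddingRequest($email, $api_key, $core_name, $payload) {
    $apiUrl = "https://api.opensolr.com/solr_manager/api/embed";
    $postFields = http_build_query([
        'email'      => $email,
        'api_key'    => $api_key,
        'index_name' => $core_name,
        // Arrays are sent as JSON; strings are sent unchanged.
        'payload'    => is_array($payload) ? json_encode($payload) : $payload
    ]);

    $ch = curl_init($apiUrl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postFields,
        CURLOPT_HTTPHEADER     => ['Content-Type: application/x-www-form-urlencoded'],
        CURLOPT_TIMEOUT        => 30,
    ]);

    $response = curl_exec($ch);
    if ($response === false) {
        // Transport-level failure (DNS, TLS, timeout): log it and stop.
        error_log('Embedding request failed: ' . curl_error($ch));
        curl_close($ch);
        return null;
    }
    curl_close($ch);
    return json_decode($response, true);
}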

The response includes the vector embedding array you can pass directly to Solr.
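
For readers working in Python, a rough counterpart of the PHP function might look like the sketch below. It reuses the endpoint URL and form field names from the PHP example; the function names are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns it undecoded beyond JSON parsing.

```python
import json
import urllib.parse
import urllib.request

EMBED_URL = "https://api.opensolr.com/solr_manager/api/embed"  # taken from the PHP example


def build_embed_form(email, api_key, index_name, payload):
    """Return the same form fields the PHP function sends."""
    if isinstance(payload, (dict, list)):
        payload = json.dumps(payload)  # mirrors the is_array() branch in PHP
    return {
        "email": email,
        "api_key": api_key,
        "index_name": index_name,
        "payload": payload,
    }


def post_embedding_request(email, api_key, index_name, payload, timeout=30):
    """POST the form and return the parsed JSON response."""
    form = build_embed_form(email, api_key, index_name, payload)
    body = urllib.parse.urlencode(form).encode("utf-8")
    # Supplying data makes urllib send a form-encoded POST.
    request = urllib.request.Request(EMBED_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting form construction from the network call keeps the first half easy to check without credentials.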


Code Examples

PHP

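<?php
// Hybrid query: a lexical (edismax) clause and a vector (knn) clause combined with {!bool}.
$url = 'https://de9.solrcluster.com/solr/vector/select?wt=json';
$params = [
    'q'            => '{!bool should=$lexicalQuery should=$vectorQuery}',
    'lexicalQuery' => '{!edismax qf=content}climate disasters',
    // Placeholder vector: replace it with the full embedding for your query text.
    'vectorQuery'  => '{!knn f=embeddings topK=50}[0.12,0.43,0.56,0.77]'
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_USERPWD, '123:123');
// Setting CURLOPT_POSTFIELDS sends the parameters as a POST body.
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($params));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
if ($response === false) {
    $response = 'Request failed: ' . curl_error($ch);
}
curl_close($ch);

echo $response;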

Python

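import requests
from requests.auth import HTTPBasicAuth

url = "https://de9.solrcluster.com/solr/vector/select"
params = {
    # {!bool} combines the lexical and vector clauses referenced below.
    'q': '{!bool should=$lexicalQuery should=$vectorQuery}',
    'lexicalQuery': '{!edismax qf=content}climate disasters',
    # Placeholder vector: replace it with the full embedding for your query text.
    'vectorQuery': '{!knn f=embeddings topK=50}[0.12,0.43,0.56,0.77]',
    'wt': 'json'
}

response = requests.post(url, data=params, auth=HTTPBasicAuth('123', '123'), timeout=30)
response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
print(response.json())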

JavaScript (AJAX)

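<script>
// URLSearchParams URL-encodes the {!knn ...} parser syntax and the vector.
const params = new URLSearchParams({
    wt: 'json',
    q: '{!knn f=embeddings topK=10}[0.11,0.22,0.33]'  // placeholder vector
});
fetch('https://de9.solrcluster.com/solr/vector/select?' + params, {
    headers: { 'Authorization': 'Basic ' + btoa('123:123') }
})
.then(r => r.ok ? r.json() : Promise.reject(new Error('HTTP ' + r.status)))
.then(console.log)
.catch(console.error);
</script>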

Quick Reference

  • Adjust topK to control how many similar results to retrieve (usually 20-100).
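  • Use {!bool should=...} for softer relevance mixing: each should clause is optional, so a document can match on keywords, on vector similarity, or on both, and the scores of the clauses it matches are added together.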
  • For best hybrid results, always combine both lexical and vector queries.
  • All demo search pages include built-in query inspector, debugQuery, crawl stats, and search analytics.
Ready to Add AI Search to Your Site?
Get a fully managed vector search engine with AI Hints and Document Reader — set up in minutes.

Search Tuning — Per-Index Relevancy Controls

Search Relevancy
Search Tuning — Fine-Tune How Your Search Ranks Results
Every index is different. A news site needs freshness. A product catalog needs exact matches. A knowledge base needs semantic understanding. Search Tuning gives you visual controls to shape relevancy per index — no code, no config files, instant effect.
Example query: wireless headphones review

DEFAULT SETTINGS
  • #1 Sony WH-1000XM5 Review (posted yesterday, comprehensive review...)
  • #2 Best Wireless Headphones 2024 (old roundup from 2 years ago...)
  • #3 Headphone Buying Guide

AFTER TUNING
  • Sony WH-1000XM5 Review, tagged FRESH (posted yesterday, boosted by freshness...)
  • #2 Headphone Buying Guide (semantically relevant, matched by meaning...)
  • #3 Best Wireless Headphones 2024 (older content ranked lower...)

Freshness boost + semantic balance pushed the new review up and demoted stale content.

Where to Find It
Search Tuning lives inside Index Settings in your Opensolr dashboard. Open your index, click the gear icon or Index Settings, and expand the Search Tuning section. Every change saves automatically — move a slider, and your very next search uses the new settings.

The Six Controls

Field Weights
Control how much each field contributes to relevancy ranking. The four searchable fields are Title, Description, URI, and Text (full body). Use the master slider to quickly shift between title-focused ranking (great for navigational queries) and text-focused ranking (great for deep content search).
Default: Title 5.0, Description 4.0, URI 0.5, Text 0.01 — title-heavy. Drag the master slider right to give body text more influence, or type exact values into each field.
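
To make the effect of these weights concrete, the sketch below turns a set of field weights into an eDisMax-style qf string, which is how per-field boosts are usually expressed in Solr. The field names title, description, uri and text are assumptions based on the labels above, and whether Opensolr builds its query exactly this way is not stated here.

```python
# Default weights quoted above; the lowercase field names are assumed.
DEFAULT_WEIGHTS = {"title": 5.0, "description": 4.0, "uri": 0.5, "text": 0.01}


def qf_string(weights):
    """Render field weights as a Solr eDisMax qf value, e.g. 'title^5 text^0.01'."""
    # A weight of 0 drops the field from the query entirely.
    return " ".join(
        f"{field}^{weight:g}" for field, weight in weights.items() if weight > 0
    )


print(qf_string(DEFAULT_WEIGHTS))

# A more text-focused configuration, as the master slider might produce (hypothetical values).
print(qf_string({"title": 3.0, "description": 2.0, "uri": 0.5, "text": 1.0}))
```

With the defaults, a term matched in the title counts far more than the same term matched in the body text.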
Freshness Boost
How much newer content is preferred over older content. Higher values push recently published or updated pages toward the top. Uses the document's creation_date field with a time-decay curve — recent documents get the biggest boost, which fades over days and weeks.
Range: 10 (barely noticeable) to 1000 (aggressively fresh). Default: 100. Only applies when search mode is set to "Fresh" — standard search ignores this setting.
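
The exact decay curve is not specified here. As an illustration only, the sketch below uses the reciprocal function that Solr provides for date boosting, recip(x, m, a, b) = a / (m * x + b), applied to document age in milliseconds. The constant 3.16e-11 (roughly one over the number of milliseconds in a year) is the value commonly used in Solr examples, and how the 10 to 1000 slider value feeds into the real formula is an open assumption.

```python
MS_PER_DAY = 86_400_000


def recip(x, m, a, b):
    """Solr's recip() function: a / (m * x + b)."""
    return a / (m * x + b)


def freshness_factor(age_days, m=3.16e-11):
    """Decay factor in (0, 1]: 1.0 for a brand-new document, about 0.5 after one year."""
    return recip(age_days * MS_PER_DAY, m, 1.0, 1.0)


for days in (0, 1, 7, 30, 365):
    print(days, round(freshness_factor(days), 4))
```

A larger m makes the curve drop faster, which is one plausible way a higher Freshness Boost could favor recent pages more aggressively.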
Minimum Match
How many of the user's search words must appear in a document for it to be considered a match. Three presets:
  • Flexible: show more results. Some words can be missing.
  • Balanced: most words must match. Good middle ground.
  • Strict: all words must match. Fewest but most precise.
Default: System-managed (adapts automatically for vector indexes). Choose a preset to override.
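
A minimal sketch of how such presets could translate into a required word count, assuming each preset maps to a percentage in the style of Solr's mm parameter (percentages are rounded down). The percentages below are hypothetical; the values Opensolr actually uses are not given here.

```python
# Hypothetical preset-to-percentage mapping, for illustration only.
PRESET_MM = {"flexible": 50, "balanced": 75, "strict": 100}


def required_terms(num_terms, preset):
    """How many of the query's words must match under the given preset."""
    pct = PRESET_MM[preset]
    # Integer division rounds down; always require at least one word.
    return max(1, (num_terms * pct) // 100)


query = "blue wireless headphones".split()
for preset in PRESET_MM:
    print(preset, required_terms(len(query), preset))
```

Under these assumed percentages, a three-word query needs one matching word on Flexible, two on Balanced, and all three on Strict.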
Semantic vs Keyword Balance
Controls how much weight goes to semantic (vector) understanding versus exact keyword matching. Only available on vector-enabled indexes (those with embeddings in the schema). Move left for keyword-heavy results, right for semantic-heavy.
Range: 0.0 (pure keyword) to 3.0 (heavily semantic). Default: 1.5 — balanced. The system also adapts dynamically based on query length (longer queries get more semantic weight), but your override takes priority.
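
One way to picture this balance is a weighted sum of a log-normalized keyword score and a cosine similarity, sketched below. This is an illustration under stated assumptions, not Opensolr's actual formula: the function name, the max_bm25 normalization constant, and the exact way the weight enters the sum are all hypothetical.

```python
import math


def hybrid_score(bm25, cosine, semantic_weight=1.5, max_bm25=20.0):
    """Combine a keyword score and a vector similarity into one ranking score."""
    # Log normalization squeezes the unbounded BM25 score into the range 0..1.
    lexical = math.log1p(min(bm25, max_bm25)) / math.log1p(max_bm25)
    # semantic_weight = 0.0 ignores the vector score entirely (pure keyword).
    return lexical + semantic_weight * cosine


# Same document scored with three slider positions.
for weight in (0.0, 1.5, 3.0):
    print(weight, round(hybrid_score(bm25=5.0, cosine=0.8, semantic_weight=weight), 3))
```

Raising the weight lets a document with a strong semantic match outrank one that only matches keywords.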
Result Quality Threshold
The minimum relevance score a document must reach to appear in results. Raise it to filter out weak matches and show only highly relevant results. Lower it to be more inclusive and show everything that has some match.
Range: 0.0 (show everything) to 1.0 (only near-perfect matches). Default: 0.60 — filters out low-relevance noise while keeping useful results.
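
As a small illustration, the sketch below filters a result list by a threshold, assuming each result carries a relevance score already normalized to the 0.0 to 1.0 range. The result structure and the URLs are made up for the example.

```python
def apply_threshold(results, threshold=0.60):
    """Keep only results whose normalized score reaches the threshold."""
    return [doc for doc in results if doc["score"] >= threshold]


# Hypothetical results with normalized scores.
results = [
    {"uri": "https://example.com/a", "score": 0.91},
    {"uri": "https://example.com/b", "score": 0.64},
    {"uri": "https://example.com/c", "score": 0.35},
]

print([doc["uri"] for doc in apply_threshold(results)])        # default 0.60
print([doc["uri"] for doc in apply_threshold(results, 0.70)])  # stricter
```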
Results Per Page
How many search results are returned in each page. Applies to both the Opensolr Search UI and API responses. Higher values show more results but increase response size.
Range: 10 to 200. Default: 50. Adjust based on your UI layout — grid layouts work well with 20-30, list layouts with 50+.

How It Works — Under the Hood

1. You move a slider. Change any control in the Search Tuning panel. The value saves automatically after a 400ms debounce, so no Save button is needed.
2. Stored per index. Your custom value is saved to your index configuration. A NULL value means "use system defaults", so resetting a control removes the override entirely.
3. Applied on next search. When a search request comes in, the engine loads your custom values and applies them as overrides on top of the system defaults. No reindexing, no restart. The very next query uses your tuning.
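
The override logic in these steps can be sketched as a simple merge, where a missing or NULL (None) value falls back to the system default. The setting names are hypothetical; the default values are the ones documented on this page.

```python
# Defaults documented on this page; the key names are assumed for illustration.
SYSTEM_DEFAULTS = {
    "title_weight": 5.0,
    "description_weight": 4.0,
    "uri_weight": 0.5,
    "text_weight": 0.01,
    "freshness_boost": 100,
    "semantic_balance": 1.5,
    "quality_threshold": 0.60,
    "results_per_page": 50,
}


def effective_settings(overrides):
    """Apply per-index overrides on top of the defaults; None means 'use default'."""
    settings = dict(SYSTEM_DEFAULTS)
    for name, value in overrides.items():
        if name not in SYSTEM_DEFAULTS:
            raise KeyError(f"unknown setting: {name}")
        if value is not None:
            settings[name] = value
    return settings


# One control customized, one reset to NULL.
tuned = effective_settings({"freshness_boost": 600, "semantic_balance": None})
print(tuned["freshness_boost"], tuned["semantic_balance"])
```

Because the merge happens at query time, changing or clearing an override affects the very next search without touching the index.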

Reset Behavior

Every control has its own Reset button that restores it to the system default. There's also a Reset All to Defaults button at the bottom of the panel that clears all customizations at once.

Reset Individual Control
Click the Reset button next to any control. The value goes back to system default and the override is removed from your index. System defaults include adaptive behavior — for example, vector indexes automatically adjust semantic weight based on query length.
Reset All to Defaults
Clears every custom value at once. Your index goes back to behaving exactly like it did before you opened Search Tuning. All adaptive behaviors are restored.

Quick Recipes

News Site: Prioritize fresh articles
Set Freshness Boost to 500-800. Set Minimum Match to Flexible. Leave field weights at defaults — titles already have the highest weight, and news articles have strong titles.
Knowledge Base: Semantic understanding first
Set Semantic vs Keyword to 2.0-2.5 (more semantic). Set Minimum Match to Flexible. Set Field Weights — increase Text weight to 1.0+ so body content has more influence. Freshness doesn't matter for evergreen docs, keep it low (10-30).
E-Commerce: Exact product matches
Set Minimum Match to Strict — users searching for "blue wireless headphones" should see results with all three words. Keep Semantic at 1.0-1.5 so typos still work. Set Result Quality Threshold to 0.70+ to cut weak matches. Results Per Page at 20-30 for grid layouts.
Blog / Content Site: Deep content discovery
Increase Text field weight to 0.5-1.0 (use the master slider toward "Text-focused"). Set Freshness at 100-200 for moderate recency bias. Minimum Match on Balanced. Semantic at 2.0 for natural-language queries that blog readers tend to use.

Defaults at a Glance

Control                     Default Value     Range
Title Weight                5.0               0 – 20
Description Weight          4.0               0 – 20
URI Weight                  0.5               0 – 20
Text Weight                 0.01              0 – 20
Freshness Boost             100               10 – 1,000
Minimum Match               System-managed    Flexible / Balanced / Strict
Semantic vs Keyword         1.5               0.0 – 3.0
Result Quality Threshold    0.60              0.0 – 1.0
Results Per Page            50                10 – 200

FAQ

Do I need to reindex after changing tuning settings?
No. Search Tuning controls are applied at query time, not index time. Your changes take effect on the very next search request.
What happens if I don't customize anything?
Everything stays at system defaults. The search engine uses battle-tested defaults that work well for most use cases, including adaptive behavior for vector indexes that adjusts parameters based on query length.
Does Semantic vs Keyword show up for all indexes?
No. It only appears on vector-enabled indexes — those using the embeddings field for semantic search. Non-vector indexes use pure keyword search, so the control isn't shown.
Does Freshness Boost always apply?
Only when the user searches with Fresh mode enabled (the "Fresh" toggle on the search UI, or fresh=yes in the API). Standard search does not apply freshness boosting regardless of this setting.
Can I set different tuning for different indexes?
Yes — that's the whole point. Every index has its own Search Tuning settings. A news index can have high freshness and flexible matching, while a product index on the same account has strict matching and low freshness. Each index is tuned independently.

Ready to Tune Your Search?
Open Index Settings in your dashboard and expand Search Tuning. Changes take effect on the very next search.

Opensolr Web Crawler vs Algolia — Why Opensolr Is the Comp...

📖 New here? For the complete step-by-step setup guide with screenshots, see Opensolr Web Crawler — Full Platform Guide →
Comparison
Opensolr Web Crawler vs Algolia
You need search on your website — or you have data that needs to be searchable. Here's what that actually takes with each platform, and why Opensolr gives you a complete search engine while Algolia gives you an API and a to-do list.
How Opensolr Works: From Zero to Full Search in Minutes
1. Create Index. Name your index with the __dense suffix for vector search support.
2. Paste Your URL. Enter your website URL. Configure scope, depth and follow rules (or just use defaults).
3. Start Crawl. Click Start. The multi-threaded crawler with JS rendering handles HTML, PDF, DOCX, Excel, PPT and more.
4. Monitor. Crawl stats show progress, pages crawled, errors and status codes.

What You Get (Included, No Extra Cost)
  • Hybrid Search: Vector + Lexical + RRF, 3 tunable modes
  • AI Search Hints: streaming LLM answers from your own content
  • Full Search UI: dark/light, facets, scroll, mobile-ready, embed code
  • Query Elevation: pin and exclude results per query or globally
  • Analytics: top queries, zero-results, click tracking and CTR
  • JS Rendering: auto-detects React, Next, Angular, Vue and more
  • 21 File Formats: PDF, DOCX, XLSX, PPTX, ODT, RTF, MSG and more
  • Price Extraction: auto-extracts prices, with range slider
  • Spellcheck: "Did you mean?" + vector semantic understanding
  • Data Ingestion API: push JSON or upload files, up to 50 docs per batch
  • Dedup Protection: URI-based document identity auto-rejects duplicates
  • Rich Text Extraction: PDF, DOCX, PPTX, ODT auto-extracted via API

All included. Fixed monthly price. No per-query charges. No per-record fees. Crawl your website, push data via API, or both. Same index, same search, same everything.

What It Actually Takes to Get Search Working
The real comparison isn't features — it's effort.
Algolia
Steps to get search on your website
1. Sign up for an Algolia account
2. Read their API documentation
3. Structure your content as JSON records
4. Write code to push records via their API
5. Build a frontend UI with InstantSearch widgets
6. Configure ranking and relevance rules
7. Set up analytics (paid add-on on some plans)
8. Write update scripts when content changes
9. Hope your bill doesn't spike next month
Developer required. Weeks of integration work.
Opensolr Web Crawler
Steps to get search on your website
1. Create an Opensolr Index (add __dense suffix)
2. Paste your website URL
3. Click Start Crawl
Done.
Full hybrid search, AI hints, analytics, elevation — all live.
No developer needed. Minutes, not weeks.
Optional: tune scope, schedule recrawls, customize embed code, pin results, read analytics — but none of that is required to be up and running.
Fixed price. No surprises. No per-query tax.
Feature-by-Feature Breakdown
Everything Opensolr includes out of the box — no add-ons, no extra cost.
1 Zero-Code Web Crawling
Opensolr
  • Paste your URL, click Start — that's it
  • Multi-threaded crawler with intelligent JS rendering
  • Three-tier rendering pipeline: curl-cffi, httpx, Playwright headless Chromium
  • Auto-detects SPAs (React, Next.js, Angular, Vue, Nuxt, SvelteKit, Gatsby)
  • Crawls 21 MIME types: HTML, PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, RTF, MSG, and more
  • Robots.txt obedience, spider trap detection, sitemap following
  • Configurable scope: domain, subdomain, path, or full web
  • Scheduled recrawls (hourly, daily, weekly)
Algolia
  • No built-in crawler — you must write code to push records via their API
  • Algolia Crawler exists as a separate paid product with limited features
  • You structure your data as JSON and maintain push scripts
  • When your content changes, you update and re-push — manually or via custom scripts
  • No document format extraction — want PDF search? Build your own pipeline
2 True Hybrid Search — Three Modes, Full Control
Opensolr
  • Vector mode: Normalized weighted sum of lexical + vector scores with tunable weights and log normalization
  • RRF mode: Reciprocal Rank Fusion — two separate requests merged mathematically for the best of both worlds
  • Solr mode: Lexical-first search with vector reranking — precision-focused
  • 1024-dimensional multilingual embeddings (50+ languages) on title and description fields
  • KNN cosine similarity on dense vectors
  • Per-field boost weights, phrase matching multipliers, minimum match tuning
  • Typos? The vector model understands what you meant, not just what you typed — semantic understanding makes traditional typo tolerance look primitive
  • Plus spellcheck with "Did you mean?" suggestions on top of that
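
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the standard technique: each document earns 1 / (k + rank) from every ranked list it appears in, and the sums decide the merged order. The constant k = 60 is the value conventionally used for RRF; whether Opensolr uses the same constant, and how it breaks ties, is an assumption.

```python
def rrf_merge(rankings, k=60):
    """Merge several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical_hits = ["a", "b", "c"]   # ranked by keyword score
vector_hits = ["b", "d", "a"]    # ranked by vector similarity
print(rrf_merge([lexical_hits, vector_hits]))
```

Documents that rank well in both lists ("a" and "b" here) rise above documents found by only one of the two requests, without comparing the raw scores of the two systems.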
Algolia
  • "NeuralSearch" exists but it's a black box — no control over modes, weights, or normalization
  • No user-tunable hybrid parameters
  • No choice between search strategies
  • Typo tolerance is good, but it only handles character-level errors — it doesn't understand meaning
  • Cannot tune field weights, phrase boosting, or minimum match
3 AI-Powered Search Summaries
Opensolr
  • Streaming AI hints powered by a GPU-accelerated LLM
  • Context-aware: sends top results (title + description + content) to the LLM
  • Real-time Server-Sent Events streaming directly in the search UI
  • Answers appear as the user searches — no extra clicks, no separate page
  • Built on your own indexed content, not hallucinated from training data
Algolia
  • No built-in LLM integration
  • To get AI summaries, you'd build your own RAG pipeline on top of Algolia
  • That means another service, another API, another bill
4 Complete Search UI — Ready to Embed
Opensolr
  • Full themed search page with dark and light modes
  • Infinite scroll or traditional pagination
  • Faceted navigation (language, locale, source, custom facets)
  • OG image previews, favicons, content type icons
  • Mobile-responsive out of the box
  • Configurable via URL parameters — no code needed
  • One-line embed code: drop an iframe and you're done
Algolia
  • Provides InstantSearch.js widget library for React, Vue, Angular
  • YOU assemble the UI from components
  • More flexible for developers, but far more work for everyone else
  • No ready-to-embed, zero-code search page
5 Full Analytics Suite — Built In, Not Upsold
Opensolr
  • Query Analytics: Top queries, daily trends, query length distribution, CSV export
  • Zero-Results Dashboard: Every zero-result query tracked by unique IP — find your content gaps
  • Click Analytics with CTR: Track which results get clicked, click-through rates per query, detect low-CTR queries that need better results
  • Bulk management: Select and delete junk/test queries across all tabs
  • All included in every plan
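
Click-through rate itself is simple arithmetic: clicks divided by searches for each query. The sketch below shows the calculation and a low-CTR filter; the input dictionaries and the 5% cutoff are hypothetical and do not reflect the export format of the analytics dashboard.

```python
def ctr_by_query(searches, clicks):
    """Click-through rate per query: clicks / searches."""
    return {
        query: clicks.get(query, 0) / count
        for query, count in searches.items()
        if count > 0
    }


def low_ctr_queries(searches, clicks, cutoff=0.05):
    """Queries whose CTR falls below the cutoff, sorted alphabetically."""
    rates = ctr_by_query(searches, clicks)
    return sorted(query for query, rate in rates.items() if rate < cutoff)


searches = {"pellet heater": 200, "garage heater": 50}
clicks = {"pellet heater": 30}
print(ctr_by_query(searches, clicks))
print(low_ctr_queries(searches, clicks))
```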
Algolia
  • Analytics exists but is a paid add-on on higher tiers
  • Click analytics requires additional client-side integration code
  • No zero-result tracking out of the box
  • You pay more to understand how your own users search
6 Query Elevation — Pin, Exclude and Curate Results
Opensolr
  • Pin specific documents to the top for specific queries
  • Exclude documents from appearing for specific queries
  • Global wildcard rules that apply to ALL queries
  • Visual elevation bar directly on the search results page
  • One-click pin/exclude while browsing results — no context switching
Algolia
  • Has "Rules" for pinning and hiding results
  • But the Rules UI is separate from search results — you can't pin while searching
  • No global wildcard rules
  • More cumbersome workflow for result curation
7 Automatic Price Extraction and Filtering
Opensolr
  • Crawler automatically extracts prices from JSON-LD, microdata, and meta tags
  • Price range slider in the search UI (no code needed)
  • Sort by price (ascending, descending, or by relevance)
  • Currency detection and display
  • Works for e-commerce sites out of the box
Algolia
  • You must manually structure price data in your JSON records
  • No automatic extraction from web pages
  • Price faceting available but requires manual schema design
8 21 Document Formats — Crawled and Indexed Automatically
Opensolr
  • HTML, PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, RTF, MSG (Outlook email), plain text, XML, RSS, JSON
  • Full text extraction with metadata preservation
  • Document reader lets users view extracted content inline without leaving search results
  • The crawler handles everything — you don't convert, parse, or pre-process anything
Algolia
  • Only indexes JSON records you push via API
  • Want to search PDFs? Build a PDF extraction pipeline yourself
  • Want to search Word documents? Same story
  • Every non-HTML format is your problem to solve
9 Sentiment Analysis and Language Detection
Opensolr
  • VADER sentiment scoring on every crawled page (positive, negative, neutral, compound)
  • Language detection via langid (50+ languages)
  • Language and locale facets in the search UI
  • All automatic — no configuration needed
Algolia
  • No sentiment analysis
  • Basic language detection but nothing automatic or enriching
10 Spellcheck, Stemming and Text Analysis
Opensolr
  • "Did you mean?" spellcheck suggestions
  • Edge n-grams for instant prefix matching (autocomplete)
  • ASCII folding for accent-insensitive search (cafe = café)
  • Stemming and synonym support
  • And on top of all that, vector search that understands meaning regardless of exact spelling
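
Two of these analysis steps are easy to illustrate outside Solr. The sketch below approximates ASCII folding by stripping combining accents, and generates edge n-grams (the prefixes that make prefix matching fast). It only mimics the idea: Solr's own filters cover more characters, and the minimum and maximum gram lengths here are assumed values.

```python
import unicodedata


def ascii_fold(text):
    """Approximate ASCII folding by removing accents, e.g. 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_len=2, max_len=10):
    """Prefixes of a token, from min_len up to max_len characters."""
    upper = min(len(token), max_len)
    return [token[:n] for n in range(min_len, upper + 1)]


print(ascii_fold("café") == ascii_fold("cafe"))
print(edge_ngrams("beanie", min_len=2, max_len=4))
```

Indexing the prefixes means a partially typed word such as "bea" can match "beanie" with an ordinary term lookup.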
Algolia
  • Typo tolerance is solid (one of Algolia's strengths)
  • But it's a black box — no tuning available
  • No vector-level semantic understanding of typos
11 URL Exclusion and Content Control
Opensolr
  • Exclude specific URL patterns from search results via search.xml config
  • Regex-based exclusion patterns
  • Combined with Query Elevation for full result curation
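
The idea behind regex-based exclusion can be sketched in a few lines. The patterns and URLs below are hypothetical examples, and the sketch does not reproduce the search.xml syntax, which is not shown here.

```python
import re

# Hypothetical exclusion patterns, for illustration only.
EXCLUDE_PATTERNS = [r"/tag/", r"\?page=\d+$", r"\.pdf$"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """True if the URL matches any exclusion pattern."""
    return any(re.search(pattern, url) for pattern in patterns)


urls = [
    "https://example.com/products/best-seller",
    "https://example.com/tag/heaters",
    "https://example.com/blog?page=3",
]
print([url for url in urls if not is_excluded(url)])
```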
Algolia
  • No URL exclusion mechanism — you'd remove records via API calls
  • Content control is code-driven, not configuration-driven
12 Predictable Pricing — No Per-Query Tax
Opensolr
  • Fixed monthly pricing — search all you want
  • No per-search-request charges
  • No per-record-per-month charges
  • Everything included: crawling, hybrid search, AI hints, analytics, elevation, search UI
  • Your bill this month is the same as next month
Algolia
  • Charges per search request ($1/1,000 searches on some plans)
  • Charges per record per month
  • Analytics, AI, and advanced features are paid add-ons
  • A traffic spike can make your bill jump 5-10x overnight
  • Algolia's pricing page is deliberately confusing — good luck figuring out your actual cost before you're committed
See our plans: Opensolr Pricing
13 Your Data, Your Infrastructure
Opensolr
  • Data lives on dedicated Solr clusters
  • No vendor lock-in — standard Apache Solr under the hood
  • Master-replica architecture with automatic failover
  • Full Solr API access to your index
  • Can migrate to self-hosted Solr at any time — your schema, your data, your rules
Algolia
  • Proprietary engine — your data is in their cloud, in their format
  • Migration out requires rebuilding everything from scratch
  • No standard API compatibility with anything else
  • You're locked in the moment you integrate
14 Data Ingestion API — Push Any Data Into Your Index Live
Opensolr
  • POST JSON payloads with up to 50 documents per batch
  • Or upload a .json file — ideal for large batches from CMS exports or data pipelines
  • URI-based document identity: every document needs a URI, and the document ID is always md5(uri). Same URI = same document. Resubmit to update.
  • Automatic dedup protection: duplicate URIs already in the queue are rejected before processing
  • Rich text extraction: set rtf: true and pass a URL to a PDF, DOCX, PPTX, ODT, XLSX, or RTF — Opensolr extracts the text for you
  • Full Solr error reporting per document — know exactly which document failed and why
  • Returns doc_ids array in every response for tracking
  • All the same features: dense vectors, hybrid search, AI hints, query elevation, analytics, the full search UI
  • Complete PHP, Python and cURL examples in the documentation
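
The identity and batching rules above can be sketched as follows. The document ID is computed as md5(uri); hashing the UTF-8 bytes and using the hex digest are assumptions about the exact encoding, and the helper names are hypothetical.

```python
import hashlib

MAX_BATCH = 50  # documents per request, as stated above


def doc_id(uri):
    """Document ID derived from the URI: the same URI always gives the same ID."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()


def batches(docs, size=MAX_BATCH):
    """Split a list of documents into chunks of at most `size`."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


# 120 hypothetical documents become three requests.
docs = [{"uri": f"https://example.com/page/{n}"} for n in range(120)]
print([len(batch) for batch in batches(docs)])
print(doc_id(docs[0]["uri"]) == doc_id("https://example.com/page/0"))
```

Because the ID depends only on the URI, resubmitting a document with the same URI updates it instead of creating a duplicate.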
Algolia
  • JSON record push via API — this is their only method of getting data in
  • No file upload — you must always construct and send JSON programmatically
  • No built-in rich text extraction — want to index a PDF? Extract it yourself first
  • No URI-based dedup — you manage document identity and deduplication in your own code
  • No integrated queue or job status — you build your own pipeline
  • Per-record-per-month charges on top of everything else
The Bottom Line
Algolia is an API. You still need to build everything around it — the crawling, the extraction, the UI, the AI layer. Opensolr is the entire search engine, ready to go. Crawl your website with zero code, push structured data via the Data Ingestion API, or do both at once — and you get hybrid vector search, AI summaries, analytics, query elevation, rich text extraction, dedup protection, and a complete search UI — all for a fixed monthly price with no per-query surprises.

For the price of a pizza, you get what would take a team of developers weeks to build on top of Algolia.