Opensolr API Endpoint: batch_embed
Overview
The batch_embed endpoint allows you to generate vector embeddings for multiple text payloads in a single request — up to 50 items at a time, each up to 50,000 characters. This is significantly faster than calling the single embed endpoint in a loop, because all texts are encoded in a single GPU/CPU batch.
Endpoint URL
https://api.opensolr.com/solr_manager/api/batch_embed
Supports GET and POST requests.
Authentication Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| email | string | Yes | Your Opensolr registration email address. |
| api_key | string | Yes | Your API key from the Opensolr dashboard. |
| index_name | string | Yes | Name of your Opensolr index (for authentication). |
Embedding Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
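| payloads | JSON array | Yes | A JSON array of strings to embed. Max 50 items, each max 50,000 characters. |
| is_query | integer (0 or 1) | No | Set to 1 when embedding search queries; leave at 0 (default) when embedding documents. |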
You can pass payloads as:
- A GET/POST parameter: payloads=["text1","text2","text3"]
- A JSON body: {"payloads": ["text1", "text2", "text3"]}
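As a rough sketch of the first form, the payload list can be serialized to a JSON string and URL-encoded with Python's standard library. The helper name build_query_string is hypothetical; the parameter names are the ones listed in the tables above.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_query_string(email, api_key, index_name, payloads):
    """Encode the request parameters; payloads travels as one JSON array string.

    build_query_string is an illustrative helper, not part of any Opensolr client.
    """
    params = {
        "email": email,
        "api_key": api_key,
        "index_name": index_name,
        # Compact separators keep the encoded URL short.
        "payloads": json.dumps(payloads, separators=(",", ":")),
    }
    return urlencode(params)


query = build_query_string(
    "your@email.com",
    "YOUR_API_KEY",
    "your_index",
    ["hello world", "opensolr is great"],
)
print(query)
```

The resulting string can be appended to the endpoint URL after a ? for a GET request, or sent as a form-encoded POST body.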
Rate Limits
| Limit | Value |
|---|---|
| Max items per batch | 50 |
| Max characters per item | 50,000 |
| Min characters per item | 2 |
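To stay within these limits when embedding a larger collection, the texts can be validated and split into batches on the client before any request is sent. The sketch below is one way to do that; make_batches is a hypothetical helper, and it assumes the API counts characters the same way Python's len() does, which the table does not confirm.

```python
MAX_ITEMS = 50      # max items per batch
MAX_CHARS = 50_000  # max characters per item
MIN_CHARS = 2       # min characters per item


def make_batches(texts):
    """Split texts into lists of at most MAX_ITEMS, rejecting out-of-range items."""
    for i, text in enumerate(texts):
        # Assumption: len() approximates how the API counts characters.
        if not MIN_CHARS <= len(text) <= MAX_CHARS:
            raise ValueError(
                f"item {i} has {len(text)} characters; "
                f"expected {MIN_CHARS} to {MAX_CHARS}"
            )
    return [texts[i:i + MAX_ITEMS] for i in range(0, len(texts), MAX_ITEMS)]


batches = make_batches([f"document number {n}" for n in range(120)])
print([len(batch) for batch in batches])
```

Each returned batch can then be sent as the payloads value of one request.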
Query vs Document Embedding
Set is_query=1 when embedding search queries (user input at search time). Leave is_query=0 (default) when embedding documents/passages at index time. The model prepends an instruction prefix for queries that optimizes the vectors for retrieval, improving recall by 5–15%, especially on natural language queries and cross-language matching.
Example Response
{
  "embeddings": [
    [0.0123, -0.0456, ...],
    [0.0789, -0.0012, ...],
    [0.0345, -0.0678, ...]
  ],
  "dimension": 1024,
  "count": 3
}
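Before using the vectors, it can be worth checking that the response is internally consistent. The following is a minimal sketch, assuming the response has exactly the fields shown above and that failures carry an error field, as the examples on this page suggest. check_response is a hypothetical helper, and the sample uses toy 3-dimensional vectors rather than real API output.

```python
def check_response(response, expected_count):
    """Return the embeddings list, or raise ValueError if the shape is inconsistent."""
    if "embeddings" not in response:
        raise ValueError(response.get("error", "Unknown error"))
    vectors = response["embeddings"]
    if response["count"] != len(vectors) or len(vectors) != expected_count:
        raise ValueError("embedding count does not match the number of payloads")
    if any(len(vec) != response["dimension"] for vec in vectors):
        raise ValueError("vector length does not match the reported dimension")
    return vectors


# Toy data for illustration only; real responses report dimension 1024.
sample = {
    "embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "dimension": 3,
    "count": 2,
}
vectors = check_response(sample, expected_count=2)
print(len(vectors))
```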
Quick Test (1-liner)
curl -s "https://api.opensolr.com/solr_manager/api/batch_embed?email=your@email.com&api_key=YOUR_API_KEY&index_name=your_index&payloads=%5B%22hello+world%22,%22opensolr+is+great%22%5D&is_query=0"
PHP Example
<?php
$url = "https://api.opensolr.com/solr_manager/api/batch_embed";

$payloads = [
    "Distributed search across multiple shards",
    "Real-time indexing with soft commits",
    "Faceted navigation for e-commerce catalogs"
];

$params = [
    "email" => "your@email.com",
    "api_key" => "YOUR_API_KEY",
    "index_name" => "your_index",
    "payloads" => json_encode($payloads),
    "is_query" => "0" // Set to "1" for search queries
];

// Send the parameters as a form-encoded POST body.
$ch = curl_init($url);
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => http_build_query($params),
    CURLOPT_TIMEOUT => 120
]);
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

if (isset($response["embeddings"])) {
    echo "Got " . $response["count"] . " embeddings of dimension " . $response["dimension"] . "\n";
    // Print the first five values of each vector.
    foreach ($response["embeddings"] as $i => $vector) {
        echo "Embedding $i: [" . implode(", ", array_slice($vector, 0, 5)) . ", ...]\n";
    }
} else {
    echo "Error: " . ($response["error"] ?? "Unknown") . "\n";
}
Python Example
import json

import requests

url = "https://api.opensolr.com/solr_manager/api/batch_embed"

payloads = [
    "Distributed search across multiple shards",
    "Real-time indexing with soft commits",
    "Faceted navigation for e-commerce catalogs",
]

params = {
    "email": "your@email.com",
    "api_key": "YOUR_API_KEY",
    "index_name": "your_index",
    "payloads": json.dumps(payloads),
    "is_query": "0",  # Set to "1" for search queries
}

response = requests.post(url, data=params, timeout=120).json()

if "embeddings" in response:
    print(f'Got {response["count"]} embeddings of dimension {response["dimension"]}')
    for i, vec in enumerate(response["embeddings"]):
        print(f"Embedding {i}: {vec[:5]}...")
else:
    print(f'Error: {response.get("error", "Unknown")}')
JSON Body Example (alternative)
You can also send the payloads as a JSON body:
curl -s -X POST "https://api.opensolr.com/solr_manager/api/batch_embed" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "your@email.com",
    "api_key": "YOUR_API_KEY",
    "index_name": "your_index",
    "payloads": ["hello world", "opensolr is great", "batch embeddings rock"],
    "is_query": "0"
  }'
Solr Schema Requirement
To store batch embeddings in Solr, your schema needs a vector field:
<field name="embeddings" type="vector" indexed="true" stored="false" multiValued="false"/>
<fieldType name="vector" class="solr.DenseVectorField" vectorDimension="1024" required="false" similarityFunction="cosine"/>
⚠️ This API uses the intfloat/multilingual-e5-large-instruct embedding model, which produces 1024-dimensional vectors, so vectorDimension must be set to 1024.
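With that field in place, the returned vectors can be attached to documents and later searched. The sketch below only builds the data structures and does not contact Solr. It assumes your documents use a unique key field named id and that your Solr version provides the knn query parser; to_solr_docs and knn_query are hypothetical helper names, and the vectors are toy 3-dimensional values.

```python
import json


def to_solr_docs(ids, vectors, field="embeddings"):
    """Pair each document id with its vector under the schema's vector field."""
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    # Assumption: "id" is the unique key field of the index.
    return [{"id": doc_id, field: vec} for doc_id, vec in zip(ids, vectors)]


def knn_query(vector, field="embeddings", top_k=10):
    """Build a {!knn} query string for a query vector."""
    return f"{{!knn f={field} topK={top_k}}}{json.dumps(vector)}"


docs = to_solr_docs(["doc-1", "doc-2"], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(json.dumps(docs))
print(knn_query([0.1, 0.2, 0.3], top_k=5))
```

The document list is in the shape accepted by a Solr JSON update request, and the query string would be passed as the q parameter of a search, with the query vector embedded using is_query=1.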
Use Cases
- Bulk preprocessing: Embed hundreds of documents before indexing them into Solr.
- Comparison: Generate embeddings for multiple candidates and compare them using cosine similarity.
- Clustering: Embed a set of texts and feed the vectors into a clustering algorithm.
- Caching: Pre-compute embeddings for frequently searched queries.
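For the comparison use case, cosine similarity can be computed directly on the returned vectors. This is a minimal pure-Python sketch with toy 2-dimensional vectors standing in for real embeddings; rank_candidates is a hypothetical helper, and for large collections a numeric library would likely be a better fit.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(candidate_vecs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only.
ranking = rank_candidates([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
print([index for index, _ in ranking])
```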
See Also
- Create vector embeddings (single payload)
- Create vector embeddings for every document in your Opensolr Index
- Hybrid Search in Opensolr
Support
For more information or help, visit Opensolr Support or use your Opensolr dashboard.
This is a premium feature available on custom plans tailored to your needs and budget. For small websites, we can even provide these features for free after validating your use case. Contact us at support@opensolr.com to discuss your requirements.