Batch vector embeddings
Opensolr API Endpoint: batch_embed
Overview
The batch_embed endpoint generates vector embeddings for multiple text payloads in a single request: up to 50 items at a time, each up to 50,000 characters. This is significantly faster than calling the single embed endpoint in a loop, because all texts are encoded in a single GPU/CPU batch.
Endpoint URL
https://api.opensolr.com/solr_manager/api/batch_embed
Supports GET and POST requests.
Authentication Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| email | string | Yes | Your Opensolr registration email address. |
| api_key | string | Yes | Your API key from the Opensolr dashboard. |
| index_name | string | Yes | Name of your Opensolr index (for authentication). |
Embedding Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| payloads | JSON array | Yes | A JSON array of strings to embed. Max 50 items, each max 50,000 characters. |
You can pass payloads as:
- A GET/POST parameter: payloads=["text1","text2","text3"]
- A JSON body: {"payloads": ["text1", "text2", "text3"]}
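As a rough illustration of the first option, the sketch below builds a GET URL using the standard library's `urllib.parse.urlencode`. The helper name `build_batch_embed_url` is hypothetical, and the compact JSON separators are only a choice to keep the URL short, not something the API is known to require.

```python
import json
from urllib.parse import urlencode

BATCH_EMBED_URL = "https://api.opensolr.com/solr_manager/api/batch_embed"


def build_batch_embed_url(email, api_key, index_name, payloads):
    """Return a GET URL with `payloads` serialized as a JSON array (hypothetical helper)."""
    params = {
        "email": email,
        "api_key": api_key,
        "index_name": index_name,
        # The endpoint expects one parameter holding a JSON array of strings.
        "payloads": json.dumps(payloads, separators=(",", ":")),
    }
    # urlencode percent-encodes brackets, quotes, and commas, and turns spaces into "+".
    return BATCH_EMBED_URL + "?" + urlencode(params)
```

For long texts, POST is likely the safer choice, since URLs have practical length limits in most clients and servers.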
Rate Limits
| Limit | Value |
|---|---|
| Max items per batch | 50 |
| Max characters per item | 50,000 |
| Min characters per item | 2 |
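A collection larger than 50 items has to be split across several requests. The following is a minimal sketch of client-side validation and batching against the limits in the table; `make_batches` is a hypothetical helper, and it assumes the limits count characters the way Python's `len()` does.

```python
MAX_ITEMS = 50        # max items per batch (from the limits table)
MAX_CHARS = 50_000    # max characters per item
MIN_CHARS = 2         # min characters per item


def make_batches(texts, batch_size=MAX_ITEMS):
    """Validate item lengths, then split `texts` into lists of at most `batch_size` items."""
    if not 1 <= batch_size <= MAX_ITEMS:
        raise ValueError(f"batch_size must be between 1 and {MAX_ITEMS}")
    for i, text in enumerate(texts):
        # Assumption: the API counts characters like len() does (code points).
        if not MIN_CHARS <= len(text) <= MAX_CHARS:
            raise ValueError(
                f"item {i} has {len(text)} characters; expected {MIN_CHARS}-{MAX_CHARS}"
            )
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
```

Each returned list can then be sent as one `payloads` array.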
Example Response
{
  "embeddings": [
    [0.0123, -0.0456, ...],
    [0.0789, -0.0012, ...],
    [0.0345, -0.0678, ...]
  ],
  "dimension": 1024,
  "count": 3
}
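Before using a response, it can help to check that it is internally consistent. The sketch below relies only on the three fields shown in the example response; the helper name `check_batch_response` is hypothetical, and the `error` key is assumed from the way this page's own client examples read failures.

```python
def check_batch_response(response, expected_count):
    """Return the embeddings list after basic consistency checks (hypothetical helper)."""
    if "embeddings" not in response:
        # Assumption: failed requests carry an "error" message.
        raise RuntimeError(response.get("error", "Unknown error"))
    embeddings = response["embeddings"]
    if response["count"] != len(embeddings) or len(embeddings) != expected_count:
        raise ValueError("embedding count does not match the number of payloads sent")
    dim = response["dimension"]
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError(f"every vector should have {dim} components")
    return embeddings
```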
Quick Test (1-liner)
curl -s "https://api.opensolr.com/solr_manager/api/batch_embed?email=your@email.com&api_key=YOUR_API_KEY&index_name=your_index&payloads=%5B%22hello+world%22,%22opensolr+is+great%22%5D"
PHP Example
<?php
$url = "https://api.opensolr.com/solr_manager/api/batch_embed";

$payloads = [
    "Distributed search across multiple shards",
    "Real-time indexing with soft commits",
    "Faceted navigation for e-commerce catalogs"
];

$params = [
    "email" => "your@email.com",
    "api_key" => "YOUR_API_KEY",
    "index_name" => "your_index",
    "payloads" => json_encode($payloads)
];

$ch = curl_init($url);
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => http_build_query($params),
    CURLOPT_TIMEOUT => 120
]);
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

if (isset($response["embeddings"])) {
    echo "Got " . $response["count"] . " embeddings of dimension " . $response["dimension"] . "\n";
    foreach ($response["embeddings"] as $i => $vector) {
        echo "Embedding $i: [" . implode(", ", array_slice($vector, 0, 5)) . ", ...]\n";
    }
} else {
    echo "Error: " . ($response["error"] ?? "Unknown") . "\n";
}
Python Example
import requests
import json

url = "https://api.opensolr.com/solr_manager/api/batch_embed"

payloads = [
    "Distributed search across multiple shards",
    "Real-time indexing with soft commits",
    "Faceted navigation for e-commerce catalogs",
]

params = {
    "email": "your@email.com",
    "api_key": "YOUR_API_KEY",
    "index_name": "your_index",
    "payloads": json.dumps(payloads),
}

response = requests.post(url, data=params, timeout=120).json()

if "embeddings" in response:
    print(f'Got {response["count"]} embeddings of dimension {response["dimension"]}')
    for i, vec in enumerate(response["embeddings"]):
        print(f"Embedding {i}: {vec[:5]}...")
else:
    print(f'Error: {response.get("error", "Unknown")}')
JSON Body Example (alternative)
You can also send the payloads as a JSON body:
curl -s -X POST "https://api.opensolr.com/solr_manager/api/batch_embed" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "your@email.com",
    "api_key": "YOUR_API_KEY",
    "index_name": "your_index",
    "payloads": ["hello world", "opensolr is great", "batch embeddings rock"]
  }'
Solr Schema Requirement
To store batch embeddings in Solr, your schema needs a vector field:
<field name="embeddings" type="vector" indexed="true" stored="false" multiValued="false"/>
<fieldType name="vector" class="solr.DenseVectorField" vectorDimension="1024" required="false" similarityFunction="cosine"/>
⚠️ This API uses the BAAI/bge-m3 embedding model, which produces 1024-dimensional vectors, so vectorDimension in the schema must be 1024.
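To show how returned vectors might be attached to documents for this schema, here is a minimal sketch. The `to_solr_docs` helper and the `id` key are hypothetical, and the pairing by position assumes the API returns embeddings in the same order as the payloads, which this page does not state explicitly.

```python
VECTOR_FIELD = "embeddings"   # field name from the schema snippet above
VECTOR_DIM = 1024             # must match vectorDimension in the schema


def to_solr_docs(doc_ids, vectors):
    """Pair document ids with vectors, assuming the API preserves input order."""
    if len(doc_ids) != len(vectors):
        raise ValueError("need exactly one vector per document id")
    docs = []
    for doc_id, vec in zip(doc_ids, vectors):
        # A vector of the wrong length would not fit the 1024-dimension field.
        if len(vec) != VECTOR_DIM:
            raise ValueError(f"{doc_id}: expected {VECTOR_DIM} values, got {len(vec)}")
        docs.append({"id": doc_id, VECTOR_FIELD: vec})
    return docs
```

The resulting list can be serialized with `json.dumps` and sent to your index's update handler as a JSON array of documents.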
Use Cases
- Bulk preprocessing: Embed hundreds of documents before indexing them into Solr.
- Comparison: Generate embeddings for multiple candidates and compare them using cosine similarity.
- Clustering: Embed a set of texts and feed the vectors into a clustering algorithm.
- Caching: Pre-compute embeddings for frequently searched queries.
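For the comparison use case, the following sketch computes cosine similarity in plain Python and ranks candidate vectors against a query vector. The function names are illustrative only; for large sets a numerical library would likely be faster.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(candidate_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In practice, `query_vec` and `candidate_vecs` would come from the `embeddings` array of one or more batch_embed responses.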
See Also
- Create vector embeddings (single payload)
- Create vector embeddings for every document in your Opensolr Index
- Hybrid Search in Opensolr
Support
For more information or help, visit Opensolr Support or use your Opensolr dashboard.