What is the Opensolr Traffic Bandwidth Limit?
The Traffic Bandwidth Limit exists so that your usage is not limited by the number of requests.
At Opensolr, we recognize that some use cases involve a lot of small requests, and it would not be fair to bill by the number of those requests. Instead, pre-paid plans are charged by the number of outgoing bytes sent from your Opensolr Index to your website or application. Therefore, 1 GB of traffic could mean 1 million requests (if you know how to optimize your requests), or it could be 1 request (if you don't).
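As a rough illustration of the byte-based billing described above, the sketch below estimates how many responses fit into a given traffic quota. The function name, the 10^9-byte definition of 1 GB, and the sample response sizes are assumptions made for the example, not Opensolr figures.

```python
def requests_per_quota(quota_bytes: int, avg_response_bytes: int) -> int:
    """Return how many responses of a given average size fit in a traffic quota."""
    if avg_response_bytes <= 0:
        raise ValueError("avg_response_bytes must be positive")
    return quota_bytes // avg_response_bytes


GB = 1_000_000_000  # assumption: 1 GB counted as 10^9 bytes

# A lean 1 KB response versus a heavy 5 MB response (hypothetical sizes).
small = requests_per_quota(GB, 1_000)
large = requests_per_quota(GB, 5_000_000)
print(small, large)  # 1000000 200
```

The same quota covers a million lean responses but only a couple of hundred heavy ones, which is why trimming the returned fields and rows matters more than the request count.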
If your Search Traffic Bandwidth Usage looks higher than expected, it is most likely that a bot, or even an attack, has reached the search pages of your website and is passing that traffic on to our servers here at Opensolr.
Solution:
First, read here, to learn about a few ways to save traffic bandwidth.
Also, read our Best Practices Guide, which may also help with saving traffic bandwidth.
Opensolr transparently logs every single request made to our Solr infrastructure.
This means that you get full access to all requests, either via our Automation API or via the Opensolr Index Control Panel.
Examples:
If you were using Solr's DataImportHandler, that is no longer possible starting with Solr 9.x.
Here's how to write a small script that will import data into your Opensolr Index, from XML files:
#!/bin/bash
# Replace every <...> placeholder below with your own values before running.
USERNAME="<OPENSOLR_INDEX_HTTP_AUTH_USERNAME>"
PASSWORD="<OPENSOLR_INDEX_HTTP_AUTH_PASSWORD>"

echo "Starting import on all indexes..."
echo ""
echo "Importing: <YOUR_OPENSOLR_INDEX_NAME>"

echo "Downloading the xml data file"
wget -q "<URL_TO_YOUR_XML_FILE>/<YOUR_XML_FILE_NAME>"

echo "Removing all data"
# Delete every document currently in the index, then commit.
curl -s -u "$USERNAME:$PASSWORD" "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" -H "Content-Type: text/xml" -d "<delete><query>*:*</query></delete>"
echo ""

echo "Uploading and Importing all data into <YOUR_OPENSOLR_INDEX_NAME>"
# Post the XML file as-is; PIPESTATUS[0] carries curl's exit code past tee.
curl -u "$USERNAME:$PASSWORD" "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" --progress-bar -H "Content-Type: text/xml" --data-binary "@<YOUR_XML_FILE_NAME>" | tee -a "/dev/null" ; test ${PIPESTATUS[0]} -eq 0
echo ""

# Clean up the downloaded file.
rm -f "<YOUR_XML_FILE_NAME>"
echo "Done!"
echo ""
The script is written so that, with a minimal tech background, you can see that everything within the <> brackets has to be replaced with your Opensolr Index Name, your Opensolr Index Hostname, the URL of your XML file, and so forth. You can get all that info in your Opensolr Index Control Panel, except for the URL of your XML file, which is hosted somewhere on your end.
Your XML file should use the classic Solr XML format.
This article should show you more about the Solr XML Data File format.
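For readers who generate the data file programmatically, here is a minimal sketch of producing the classic Solr XML update format (an add element wrapping doc elements, each holding named field elements). The helper name and the sample field names are hypothetical; the fields you emit must exist in your own schema.xml.

```python
import xml.etree.ElementTree as ET


def build_solr_add_xml(docs):
    """Serialize a list of dicts into a Solr <add> XML update message.

    A list value becomes several <field> elements with the same name,
    which is how multi-valued fields are written.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", {"name": name})
                field.text = str(v)  # ElementTree escapes &, < and > for us
    return ET.tostring(add, encoding="unicode")


# Hypothetical documents, for illustration only.
docs = [
    {"id": "1", "title": "First document", "tags": ["a", "b"]},
    {"id": "2", "title": "Tom & Jerry"},
]
xml_payload = build_solr_add_xml(docs)
print(xml_payload)
```

Writing xml_payload to a file gives you something the import script can upload with its Content-Type: text/xml request.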
Solr will often use quite a lot of RAM in order to build the search results response.
Therefore, it is important to follow a few Best Practices, to ensure that we do not use resources that are not actually needed.
It often happens that a dedicated Opensolr Environment with quite an extensive amount of RAM can't handle even rather small Solr Indexes, either because of the wrong implementation of certain schema.xml configuration parameters, or because of requests that cause Solr to be killed by the Solr OOM script when the Environment runs out of memory.
Also, Opensolr has a self-healing process that will kick in for any crashed Solr process, recovering the Solr service in just under 1 minute.
Here are some Best Practices that you can use, to mitigate these effects:
<field name="name" docValues="true" type="text_general" indexed="true" stored="true" />
<field name="description" type="text_general" indexed="true" stored="true" docValues="true" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
<filterCache size="1" initialSize="1" autowarmCount="0"/>
UPDATE - July 09, 2024:
This is now being set in your solr.xml file that resides in your solr home directory (where your cores / collections are):
<solr>
<int name="maxBooleanClauses">90589</int>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:600000}</int>
<int name="connTimeout">${connTimeout:60000}</int>
</shardHandlerFactory>
</solr>
But you could also apply the advice below.
If you get the Solr error "too many boolean clauses", please check your synonyms.txt, stopwords.txt or protwords.txt, and try to make those files smaller.
In other words, Solr is trying to apply boolean clauses for each one of the words found in any of those files, depending on your schema.xml configuration.
A quick fix is to remove some of the synonyms from synonyms.txt, or words from the other txt files mentioned here. You can also look at your schema.xml and make sure that your synonyms, stopwords and protwords are configured properly in the chain of tokenizers and filters for your fieldType definitions.
Also, try not to apply synonyms.txt at indexing time, as that replaces many of the original words with their synonyms, considerably increasing the size of your index and also making search far less relevant in some cases.
Here's an example setup for the synonyms.txt usage, in a text_general field, that we use for our Web Crawler Site Search solution:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
You can learn more here.
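Before trimming synonyms.txt, it can help to measure it. The sketch below counts rules and terms in a file that follows the standard Solr synonyms format (comma-separated equivalents, optional => mappings, # comments). The function name is hypothetical, and the term count is only a rough proxy: the actual number of boolean clauses depends on your analysis chain.

```python
def synonym_stats(text):
    """Count rules and terms in Solr synonyms.txt content.

    Returns (rule_count, term_count, largest_rule_size).
    """
    rules = terms = largest = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Both sides of an explicit mapping "a, b => c" contribute terms.
        parts = line.replace("=>", ",").split(",")
        size = len([p for p in parts if p.strip()])
        rules += 1
        terms += size
        largest = max(largest, size)
    return rules, terms, largest


sample = """# sample synonyms
tv, television, telly
laptop, notebook => computer

ipod, i-pod
"""
print(synonym_stats(sample))  # (3, 8, 3)
```

Rules with a very large size are usually the first candidates to split up or remove.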
Enabling spellcheck in Apache Solr is a useful feature that allows you to provide suggestions for misspelled or incorrect search queries. To enable spellcheck in Solr, you need to configure your Solr schema, Solr configuration files, and query parameters. Here's a step-by-step guide on how to do it:
Open your Solr schema file (usually schema.xml) located in your Solr core's conf directory.
Define a field type for spellchecking based on the TextField type, for example:
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="content" type="textSpell" indexed="true" stored="true"/>
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
Open your Solr configuration file (solrconfig.xml) located in your Solr core's conf directory.
Locate the <requestHandler> configuration section for your search endpoint (e.g., /select) and add the spellcheck component to it. You should also configure other parameters as needed.
<requestHandler name="/select" class="solr.SearchHandler">
<!-- ... -->
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
In the same solrconfig.xml file, configure the spellcheck component. You can define its settings under the <searchComponent> section.
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.5</float>
<int name="maxEdits">2</int>
<int name="minPrefix">1</int>
<int name="maxInspections">5</int>
<int name="minQueryLength">3</int>
<float name="maxQueryFrequency">0.5</float>
</lst>
</searchComponent>
To request suggestions, add the spellcheck parameter to your query:
/select?q=your_query&spellcheck=true
Solr returns the suggestions in the spellcheck section of the response.
By following these steps, you should be able to enable spellcheck in Apache Solr and provide search query suggestions for misspelled terms. Make sure to adjust the configuration parameters according to your specific use case and requirements.
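When the query is built in application code, URL-encoding the parameters avoids broken requests for queries that contain spaces or special characters. This is a minimal sketch: the function name and the base URL are hypothetical, while spellcheck and spellcheck.count are standard Solr request parameters.

```python
from urllib.parse import urlencode


def spellcheck_url(base_url, query, count=5):
    """Build a /select URL that asks Solr for spellcheck suggestions."""
    params = {
        "q": query,
        "spellcheck": "true",
        "spellcheck.count": count,  # maximum suggestions per misspelled term
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and index name, for illustration only.
url = spellcheck_url("https://example.invalid/solr/my_index", "helo wrld")
print(url)
```

The printed URL can be passed to any HTTP client, together with your index's HTTP auth credentials if it is protected.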
For any questions that look like those below:
Why am I not getting results for query A, in AJAX, but I am getting results for query A, without AJAX?
Why am I not getting results for query A?
Why am I not getting results for query B?
For all of the above questions, you should refer to the online Solr documentation, or the Drupal Community (if using Drupal).
Opensolr provides the Solr as a Service platform.
Solr search results are not the responsibility of Opensolr. The way queries work depends solely on your Solr implementation, or on the implementation of the CMS you are using.
Please be advised that your Opensolr Index may fail to reload when using AnalyzingInfixSuggester.
It turns out that Drupal exports the Solr Configuration zip archive erroneously.
Basically, you will need to manually edit solrconfig_extra.xml, in order to explicitly specify a separate folder for each suggester dictionary.
You can click here to learn more, from the Bug reported to the Drupal Community.
UPDATE: As of Feb 08, 2023, the new Opensolr Config Files Uploader should take care of these dependencies automatically, so the steps below should not be necessary.
However, if you still run into issues, you can try the steps below:
It is often the case (as it is with Drupal) that your config files will contain files like schema_extra.xml or solrconfig_extra.xml.
In this case, your main schema.xml and solrconfig.xml will contain references to various fields and types that are defined in those extra files.
Therefore, you need to split your config files archive into multiple archives, and upload them as follows:
- First upload the extra files (zip up the schema_extra.xml and other *_extra.xml files and upload that zip first)
- Second upload the main schema.xml file, along with all other resource files, such as stopwords.txt, synonyms.txt, etc.
- Third, upload a zip archive that contains solrconfig.xml alone.
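The three-archive split above can be scripted. The sketch below is one possible way to do it, assuming a local copy of the conf directory. The function name and archive file names are made up for the example, and the upload itself is left out.

```python
import zipfile
from pathlib import Path


def build_upload_archives(conf_dir, out_dir):
    """Split a Solr conf directory into three zips, returned in upload order."""
    conf_dir, out_dir = Path(conf_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # keep out_dir outside conf_dir
    files = sorted(p for p in conf_dir.rglob("*") if p.is_file())

    extra = [p for p in files if p.name.endswith("_extra.xml")]
    solrconfig = [p for p in files if p.name == "solrconfig.xml"]
    main = [p for p in files if p not in extra and p not in solrconfig]

    archives = []
    for name, members in (
        ("1_extra.zip", extra),                 # upload first
        ("2_schema_and_resources.zip", main),   # upload second
        ("3_solrconfig.zip", solrconfig),       # upload last
    ):
        if not members:
            continue  # nothing to upload for this step
        target = out_dir / name
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
            for p in members:
                # Store paths relative to conf_dir so sub-folders are kept.
                zf.write(p, arcname=p.relative_to(conf_dir).as_posix())
        archives.append(target)
    return archives
```

Calling build_upload_archives("conf", "upload") returns the archive paths in the order they should be uploaded.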
Solr works with a set of multiple configuration files.
Each Solr configuration file has its own purpose.
Therefore, in some cases, some publishers (CMS systems, etc.) will choose to create their own structure for such Solr configuration files, as is the case with Drupal, and maybe WordPress (WPSOLR), and others.
When uploading your Solr configuration files using your Opensolr Index Control Panel, it is therefore important to upload your files in a specific order: the *_extra.xml files first, then schema.xml together with the other resource files, and finally solrconfig.xml alone.
So, basically, you should simply create those 3 archives and upload them separately, in this exact order, and then everything should work.
You can, of course, automate this by using the Automation REST API to upload your config files.
If you get the error:
Undefined field _text_
Please make sure to open up solrconfig.xml in your Opensolr Control Panel Admin UI and remove the reference to the _text_ field under the /update initParams:
<initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
<lst name="defaults">
<str name="df">_text_</str>
</lst>
</initParams>
Yes, however, it's only active on some servers right now.
Please ask us to install that, or any other plugin solr library, by following the guide here, and we'll be happy to set it up for you.
Sometimes, in the shared Opensolr cloud, the data folder may get corrupted, so it can't be read from or written to.
One easy fix for this is to simply remove your index, and then create another one, preferably under another name.
If that doesn't work, please contact us, and we'll be happy to fix it for you.
Also, keep in mind that there may be other reasons, so please check your error log by clicking the Error Log button inside your Opensolr index control panel, and keep refreshing that page to make sure the errors you see are accurate.
If you do see errors in there, please email them to us at support@opensolr.com and we'll fix it for you.
To move from using the managed-schema to schema.xml, simply follow the steps below:
In your solrconfig.xml file, look for the schemaFactory definition. If you have one, remove it and add this instead:
<schemaFactory class="ClassicIndexSchemaFactory"/>
If you don't have it, just add the above snippet somewhere above the requestHandler definitions.
To move from using the classic schema.xml in your opensolr index, to the managed-schema simply follow the steps below:
In your solrconfig.xml, look for a schemaFactory definition, and replace it with this snippet:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
If you don't have any schemaFactory definition, just paste the above snippet into your solrconfig.xml file, just above any requestHandler definition.
If you usually get an error such as "Unknown field..." or "Missing field", and your schema.xml already contains those fields, make sure you disable the Schemaless mode in solrconfig.xml.
Just head on to the Config Files Editor in your opensolr index control panel, and locate a snippet that looks like this:
class="ManagedIndexSchemaFactory"
According to the solr documentation, you can disable the ManagedIndexSchemaFactory as per the instructions below:
To disable dynamic schema REST APIs, use the following for <schemaFactory>: <schemaFactory class="ClassicIndexSchemaFactory"/>
Also do not forget to remove the entire snippet regarding the ManagedIndexSchemaFactory, so that you won't accidentally use both.
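If you prefer to script the switch between the two schema factories rather than edit the file by hand, the following is a minimal sketch. The function name is hypothetical, it assumes schemaFactory is a direct child of the root config element, and Python's ElementTree discards XML comments when parsing, so treat it as an illustration and not as a drop-in tool for a heavily commented solrconfig.xml.

```python
import xml.etree.ElementTree as ET


def set_schema_factory(solrconfig_xml, managed):
    """Return solrconfig.xml text with exactly one schemaFactory definition."""
    root = ET.fromstring(solrconfig_xml)
    # Remove every existing definition so both factories are never present.
    for el in root.findall("schemaFactory"):
        root.remove(el)

    if managed:
        factory = ET.Element("schemaFactory", {"class": "ManagedIndexSchemaFactory"})
        mutable = ET.SubElement(factory, "bool", {"name": "mutable"})
        mutable.text = "true"
        resource = ET.SubElement(factory, "str", {"name": "managedSchemaResourceName"})
        resource.text = "managed-schema"
    else:
        factory = ET.Element("schemaFactory", {"class": "ClassicIndexSchemaFactory"})

    # Insert as the first child, which places it above any requestHandler.
    root.insert(0, factory)
    return ET.tostring(root, encoding="unicode")
```

Passing managed=False yields the classic schema.xml setup, and managed=True yields the managed-schema setup shown earlier.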
Yes, Opensolr now supports the JTS Topology Suite, by default, which does not come bundled with the default Solr distribution.
It should be enabled on most of our servers and datacenters. However, if you feel that it doesn't work for your index, please Contact Support and we'll be happy to enable it for you.
No further setup will be required on your part.
Please go to https://opensolr.com/pricing and make sure you select the analytics option from the extra features tab, when you upgrade your account.
If you can see analytics but no data, make sure your Solr queries are correctly formatted in the form:
https://server.opensolr.com/solr/index_name/select?q=your_query&other_params...
So, the search query must be clearly visible in the q parameter in order for it to show in analytics.
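A quick way to check this in application code is to test whether a request URL carries a non-empty q parameter. This is only a sketch of the rule as stated above; the helper name is hypothetical, and it does not reproduce how Opensolr analytics processes requests internally.

```python
from urllib.parse import urlsplit, parse_qs


def has_visible_query(url):
    """Return True if the URL carries a non-empty q parameter."""
    params = parse_qs(urlsplit(url).query)  # blank values are dropped by default
    return any(value.strip() for value in params.get("q", []))


print(has_visible_query("https://server.opensolr.com/solr/index_name/select?q=laptop&rows=10"))
print(has_visible_query("https://server.opensolr.com/solr/index_name/select?rows=10"))
```

The first call prints True and the second prints False, since the second URL has no q parameter at all.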
Here are a few ways to save your monthly allotted bandwidth:
There are a couple of things you might be able to do to trade performance for index size. For example, an integer (int) field uses less space than a trie integer (tint), but range queries will be slower when using an int.
To make major reductions in your index, you will almost certainly need to look more closely at the fields you are using.
EZcmd.com is a useful set of GeoData and GeoIP utilities.