Solr often uses a lot of RAM to build the search results response.
It is therefore important to follow a few best practices, so that Solr does not consume resources it does not actually need.
It often happens that a dedicated Opensolr Environment with plenty of RAM cannot handle even rather small Solr indexes. The usual causes are misconfigured schema.xml parameters and expensive requests, which lead to Solr being killed by the Solr OOM script when the Environment runs out of memory.
Opensolr also has a self-healing process that kicks in for any crashed Solr process and recovers the Solr service in just under 1 minute.
Here are some best practices that you can use to mitigate these effects:
<field name="name" docValues="true" type="text_general" indexed="true" stored="true" />
<field name="description" type="text_general" indexed="true" stored="true" docValues="true" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
<filterCache size="1" initialSize="1" autowarmCount="0"/>
If you get the Solr error "too many boolean clauses", check your synonyms.txt, stopwords.txt and protwords.txt files, and try to make them smaller.
Depending on your schema.xml configuration, Solr may try to apply a boolean clause for each word found in those files.
A quick fix is to remove some of the synonyms from synonyms.txt, or some of the words from the other txt files mentioned here. You can also review your schema.xml and make sure that your synonyms, stopwords and protwords are configured properly in the chain of tokenizers and filters of your fieldType definitions.
Also, try not to apply synonyms.txt at indexing time. Doing so replaces many of the original words with their synonyms, which considerably increases the size of your index and can make search far less relevant in some cases.
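To decide which entries to trim, it can help to see which synonym groups are the largest. The following is a minimal Python sketch, not an Opensolr tool: the function names are hypothetical, and it assumes the usual synonyms.txt layout (comma-separated equivalents, optional "=>" mappings, "#" comments) without handling escaped commas.

```python
from typing import List, Tuple


def synonym_groups(text: str) -> List[List[str]]:
    """Parse synonyms.txt content into lists of terms, one list per rule line."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Explicit mappings look like "a, b => c"; count terms on both sides.
        sides = line.split("=>")
        terms = [t.strip() for side in sides for t in side.split(",") if t.strip()]
        groups.append(terms)
    return groups


def largest_groups(text: str, top: int = 5) -> List[Tuple[int, List[str]]]:
    """Return the `top` biggest groups as (term_count, terms) pairs."""
    ranked = sorted(synonym_groups(text), key=len, reverse=True)
    return [(len(group), group) for group in ranked[:top]]


if __name__ == "__main__":
    sample = "# demo\nfoo, bar, baz\n\ncouch => sofa\n"
    for size, terms in largest_groups(sample):
        print(size, terms)
```

Running it against your own synonyms.txt content shows the rules that contribute the most terms, which are the first candidates for removal.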
Here's an example setup for the synonyms.txt usage, in a text_general field, that we use for our Web Crawler Site Search solution:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
You can learn more here.
Enabling spellcheck in Apache Solr is a useful feature that allows you to provide suggestions for misspelled or incorrect search queries. To enable spellcheck in Solr, you need to configure your Solr schema, Solr configuration files, and query parameters. Here's a step-by-step guide on how to do it:
Open the schema file (schema.xml) located in your Solr core's conf directory.
Define a field type for spellcheck, based on the TextField type, for example:
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="content" type="textSpell" indexed="true" stored="true"/>
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
Open the Solr configuration file (solrconfig.xml) located in your Solr core's conf directory.
Find the <requestHandler> configuration section for your search endpoint (e.g., /select) and add the spellcheck component to it. You should also configure other parameters as needed.
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
In the same solrconfig.xml file, configure the spellcheck component. You can define its settings under the <searchComponent> section.
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">3</int>
    <float name="maxQueryFrequency">0.5</float>
  </lst>
</searchComponent>
To request suggestions, add the spellcheck parameter to your query:
/select?q=your_query&spellcheck=true
Solr returns the suggestions in the spellcheck section of the response.
By following these steps, you should be able to enable spellcheck in Apache Solr and provide search query suggestions for misspelled terms. Make sure to adjust the configuration parameters according to your specific use case and requirements.
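The query step can be sketched in Python as follows. This is only an illustration under a few assumptions: the base URL is a placeholder, the helper names are hypothetical, and the parser expects Solr's default JSON layout, in which spellcheck.suggestions is a flat list alternating between the misspelled term and its details.

```python
from typing import Dict, List
from urllib.parse import urlencode


def spellcheck_url(base_url: str, query: str, count: int = 5) -> str:
    """Build a /select URL that asks Solr for spelling suggestions."""
    params = {
        "q": query,
        "spellcheck": "true",
        "spellcheck.count": count,  # max suggestions per term
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def extract_suggestions(response: dict) -> Dict[str, List[str]]:
    """Map each misspelled term to its suggested replacements.

    Assumes the default flat layout: [term, {...}, term, {...}, ...].
    """
    flat = response.get("spellcheck", {}).get("suggestions", [])
    result = {}
    for term, info in zip(flat[0::2], flat[1::2]):
        result[term] = list(info.get("suggestion", []))
    return result


if __name__ == "__main__":
    # Placeholder host and index name; replace with your own.
    print(spellcheck_url("https://server.opensolr.com/solr/index_name", "helo wrld"))
```

If you enable options such as spellcheck.extendedResults, the response layout changes and the parser would need to be adapted.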
Please be advised that your Opensolr Index may fail to reload when using AnalyzingInfixSuggester.
It turns out that Drupal exports the Solr configuration zip archive incorrectly.
To work around this, manually edit solrconfig_extra.xml and explicitly specify a separate folder for each suggester dictionary.
You can learn more from the bug reported to the Drupal community.
It is often the case (as it is with Drupal) that your config files archive contains files like schema_extra.xml or solrconfig_extra.xml.
In this case, your main schema.xml and solrconfig.xml will contain references to various fields and types that are defined in those extra files.
Therefore, you need to split your config files archive into multiple archives, and upload them as follows:
- First upload the extra files (zip up the schema_extra.xml and other *_extra.xml files and upload that zip first)
- Second upload the main schema.xml file, along with all other resource files, such as stopwords.txt, synonyms.txt, etc.
- Third, upload a zip archive that contains solrconfig.xml alone.
Solr works with a set of multiple configuration files.
Each Solr configuration file has its own purpose.
Therefore, some publishers (CMS systems, etc.) choose to create their own structure for these Solr configuration files, as is the case with Drupal, and possibly WordPress (WPSOLR) and others.
When uploading your Solr configuration files through your Opensolr Index Control Panel, it is therefore important to upload them in the specific order listed above.
In short, create those 3 archives and upload them separately, in that exact order, and everything should work.
You can, of course, automate this by using the Automation REST API to upload your config files.
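Creating the three archives can be scripted. The sketch below is a hypothetical helper, not part of Opensolr: the archive names are made up, and it assumes that every file ending in _extra.xml belongs in the first archive and that all config files sit directly in one folder. The upload itself is left out, since the Automation REST API calls are not described here.

```python
import zipfile
from pathlib import Path
from typing import Dict, List


def plan_archives(filenames: List[str]) -> Dict[str, List[str]]:
    """Group config files into three archives, in upload order."""
    extras = sorted(f for f in filenames if f.endswith("_extra.xml"))
    solrconfig = [f for f in filenames if f == "solrconfig.xml"]
    # Everything else (schema.xml, stopwords.txt, synonyms.txt, ...) goes second.
    main = sorted(f for f in filenames if f not in extras and f not in solrconfig)
    return {
        "1_extra.zip": extras,
        "2_schema.zip": main,
        "3_solrconfig.zip": solrconfig,
    }


def build_archives(conf_dir: str, out_dir: str) -> List[Path]:
    """Write the non-empty archives to out_dir and return them in upload order."""
    conf, out = Path(conf_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = [p.name for p in conf.iterdir() if p.is_file()]
    created = []
    for archive_name, members in plan_archives(names).items():
        if not members:
            continue
        target = out / archive_name
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
            for member in members:
                zf.write(conf / member, arcname=member)
        created.append(target)
    return created
```

Calling build_archives("conf", "upload") on a local copy of your config folder produces the archives in the order in which they should be uploaded.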
If you get the error:
Undefined field _text_
Please make sure to open up solrconfig.xml in your Opensolr Control Panel Admin UI and remove the reference to the _text_ field under the /update initParams:
<initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
  <lst name="defaults">
    <str name="df">_text_</str>
  </lst>
</initParams>
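If you prefer to make this change on a local copy of solrconfig.xml rather than in the editor, a minimal Python sketch could look like the one below. The function name is hypothetical, and it assumes the reference lives in a str element named df inside an initParams block, as in Solr's default configuration. Note that ElementTree discards XML comments when parsing, so review the result before uploading it.

```python
import xml.etree.ElementTree as ET


def remove_text_df(solrconfig_xml: str) -> str:
    """Remove <str name="df">_text_</str> from every initParams block."""
    root = ET.fromstring(solrconfig_xml)  # comments are dropped by the default parser
    for init in root.iter("initParams"):
        for lst in init.findall("lst"):
            # Copy the list first, because we remove children while looping.
            for node in list(lst.findall("str")):
                is_df = node.get("name") == "df"
                if is_df and (node.text or "").strip() == "_text_":
                    lst.remove(node)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<config><initParams path="/update/**,/select">'
        '<lst name="defaults"><str name="df">_text_</str></lst>'
        "</initParams></config>"
    )
    print(remove_text_df(sample))
```

Default fields that point to anything other than _text_ are left untouched.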
Yes, however, it is only active on some servers right now.
Please ask us to install that, or any other Solr plugin library, by following the guide here, and we'll be happy to set it up for you.
Sometimes, in the shared Opensolr cloud, the data folder may get corrupted, so that it can't be read from or written to.
One easy fix for this is to simply remove your index and create another one, preferably under another name.
If that doesn't work, please contact us, and we'll be happy to fix it for you.
Also, keep in mind that there may be other causes, so please check your error log by clicking the Error Log button inside your Opensolr index control panel, and keep refreshing that page to make sure the errors you see are current.
If you do see errors in there, please email them to us at support@opensolr.com and we'll fix it for you.
To move from using the managed-schema to schema.xml, simply follow the steps below:
In your solrconfig.xml file, look for the schemaFactory definition. If you have one, remove it and add this instead:
<schemaFactory class="ClassicIndexSchemaFactory"/>
If you don't have one, just add the above snippet somewhere above the requestHandler definitions.
To move from using the classic schema.xml in your Opensolr index to the managed-schema, simply follow the steps below:
In your solrconfig.xml, look for a schemaFactory definition, and replace it with this snippet:
<schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str name="managedSchemaResourceName">managed-schema</str> </schemaFactory>
If you don't have any schemaFactory definition, just paste the above snippet into your solrconfig.xml file, just above any requestHandler definition.
If you regularly get an error such as "Unknown field..." or "Missing field", and your schema.xml already contains those fields, make sure you disable the Schemaless mode in solrconfig.xml.
Just head over to the Config Files Editor in your Opensolr index control panel, and locate a snippet that looks like this:
class="ManagedIndexSchemaFactory"
According to the Solr documentation, you can disable the ManagedIndexSchemaFactory as per the instructions below:
To disable dynamic schema REST APIs, use the following: <schemaFactory class="ClassicIndexSchemaFactory"/>
Also do not forget to remove the entire snippet regarding the ManagedIndexSchemaFactory, so that you won't accidentally use both.
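A quick way to confirm that only one schema factory is left is to list every schemaFactory element in a local copy of solrconfig.xml. This is a small hypothetical Python check, not an Opensolr feature.

```python
import xml.etree.ElementTree as ET
from typing import List


def schema_factories(solrconfig_xml: str) -> List[str]:
    """Return the class attribute of every <schemaFactory> element found."""
    root = ET.fromstring(solrconfig_xml)
    return [el.get("class") for el in root.iter("schemaFactory")]


if __name__ == "__main__":
    sample = (
        "<config>"
        '<schemaFactory class="ManagedIndexSchemaFactory"/>'
        '<schemaFactory class="ClassicIndexSchemaFactory"/>'
        "</config>"
    )
    found = schema_factories(sample)
    if len(found) > 1:
        print("More than one schemaFactory defined:", found)
```

If the list contains more than one entry, remove the ManagedIndexSchemaFactory snippet so that only the ClassicIndexSchemaFactory definition remains.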
Yes, Opensolr now supports the JTS Topology Suite by default, which does not come bundled with the default Solr distribution.
It should be enabled on most of our servers and datacenters. However, if it doesn't work for your index, please Contact Support and we'll be happy to enable it for you.
No further setup will be required on your part.
Please go to https://opensolr.com/pricing and make sure you select the analytics option from the extra features tab, when you upgrade your account.
If you can see analytics but no data, make sure your Solr queries are correctly formatted, in the form:
https://server.opensolr.com/solr/index_name/select?q=your_query&other_params...
So, the search query must be clearly visible in the q parameter in order for it to show in analytics.
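To check this for your own request URLs, a small Python sketch such as the following can help. The function name is hypothetical, and it only verifies that a non-empty q parameter is present in the query string, which is the condition described above.

```python
from urllib.parse import parse_qs, urlsplit


def has_visible_query(url: str) -> bool:
    """Return True if the URL carries a non-empty q parameter."""
    query_string = urlsplit(url).query
    # parse_qs drops blank values by default, so "q=" yields no "q" key.
    values = parse_qs(query_string).get("q", [])
    return any(value.strip() for value in values)


if __name__ == "__main__":
    url = "https://server.opensolr.com/solr/index_name/select?q=your_query&rows=10"
    print(has_visible_query(url))
```

Requests that send the search terms some other way, for example in the body of a POST request, would not pass this check.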
Here are a few ways to save your monthly allotted bandwidth:
There are a couple things you might be able to do to trade performance for index size. For example, an integer (int) field uses less space than a trie integer (tint), but range queries will be slower when using an int.
To make major reductions in your index, you will almost certainly need to look more closely at the fields you are using.
EZcmd.com is a useful set of GeoData and GeoIP utilities.