Documentation

Select a category on the left, to get your answers quickly

What is the Opensolr Traffic Bandwidth Limit:
The Traffic Bandwidth Limit, is there, instead of limiting your usage, by the number of requests.
At Opensolr, we recognize that there may be those use cases where you could have a lot of smaller requests, and it's not fair to bill and charge for the number of those requests, but rather, this will charge you (on a pre-paid plan), on the number of outgoing bytes that are being sent from your Opensolr Index, to your website or application. Therefore, 1 Gb of Traffic could mean 1 million requests (if you know how to optimize your requests), or it could be 1 request (if you don't know how to optimize your requests).
If you are wondering about your Search Traffic Bandwidth Usage, it is most likely, that a Bot, or even an attack has reached the search pages throughout your website, and this way, they will pass that traffic on to our servers here at Opensolr.

Solution:
First, read here, to learn about a few ways to save traffic bandwidth.
Also, read our Best Practices Guide, which may also help with saving traffic bandwidth.
Opensolr, transparently logs every single request made to our Solr infrastructure.
This means, that you get full access to see all requests, either via our Automation API, or via the Opensolr Index Control Panel.

Examples:

  1. API - Logs & Analytics:
    • This will give you the option to get and facet your requests by various fields.
    • Any Solr supported paramter is accepted while using our API, as it is also  described here.
    • Below is an example on how to facet all results by IP and path. You can learn more here.
  2. Index Control Panel Analytics.
    • This will show you a few useful metrics where you can learn more about the traffic spikes, popular queries, and more. You can find some examples below:
  3. Tail the logs. This can be done via your Index Control Panel, by selecing Solr Request Log or Bandwidth Request Log.
    • That is a tail of the last 1000 lines from your live request log. This will help you identify the live traffic in your website, and also helps you take necessary action on your web server, in order to mitigate any bottlenecks or reduce unwanted traffic. 
    • Examples follow below:

Importing data from XML into Opensolr

If you were using Solr's DataImport Handler, starting with Solr 9.x that is no longer possible.
Here's how to write a small script that will import data into your Opensolr Index, from XML files:

#!/bin/bash
USERNAME="<OPENSOLR_INDEX_HTTP_AUTH_USERNAME>"
PASSWORD="<OPENSOLR_INDEX_HTTP_AUTH_PASSWORD>"

echo "Starting import on all indexes..."
echo ""

echo "Importing: <YOUR_OPENSOLR_INDEX_NAME>"
echo "Downloading the xml data file"
wget -q <URL_TO_YOUR_XML_FILE>/<YOUR_XML_FILE_NAME>
echo "Removing all data"
curl -s -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" -H "Content-Type: text/xml" -d "*:*"
echo ""
echo "Uploading and Importing all data into <YOUR_OPENSOLR_INDEX_NAME>"
curl -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" --progress-bar -H "Content-Type: text/xml" --data-binary @<YOUR_XML_FILE_NAME> | tee -a "/dev/null" ; test ${PIPESTATUS[0]} -eq 0
echo ""
rm -rf <YOUR_XML_FILE_NAME>
echo "Done!"
echo ""
echo ""
echo ""

Now, the way this is made, is that if you have a minimal tech background, you can understand that everything within the <> brackets will have to be replaced with your Opensolr Index Name, your Opensolr Index Hostname, the URL for your XML file, and so forth. You can get all that info in your Opensolr Index Control Panel. Except for the URL to your XML file, which that is hosted somewhere on your end.

The way you format your XML file, is the classic Solr format.
This article may should show you more about the Solr XML Data File format.

Solr will often use quite a lot of RAM Memory, in order to build the search results response.

Therefore it is important that we follow a few Best Practices, in order to ensure that we do not overuse any resources that would not otherwise be needed.

It often happens that a dedicated Opensolr Environment, with quite an extensive amount of RAM Memory, won't be able to handle rather small Solr Indexes, because of the wrong implementation of certain Solr schema.xml configuration parameters, and other requests that will cause Solr to be killed by the Solr OOM script, when the Environment runs out of memory.

Also, Opensolr has a self-healing processes that will kick in for any crashed Solr process, recovering the Solr service in just under 1 minute.

Here are some Best Practices that you can use, to mitigate these effects:

  1. We have defined here, a few methods in which you can Save Transfer Bandwidth, which would in turn, also help with the Memory Management as well.
  2. Are you using Solr to return a very large number of documents in one request?
    1. That will cause Solr to allocate memory for all the data in your index, and then keep that cached, and reallocate new memory for each new query. 
    2. The solution is to keep the &rows parameter to a value under 100, (&rows=100) as much as possible, and not tell Solr to return more data than it is necessary in each request.
  3. Are you paginating over a very large number of pges?
    1. Requesting Solr to return documents starting from a very high offset, will again cause it to allocate memory for all the data in the index, and have that cached in, which will, again, quickly exhaust the Envionment's RAM Memory.
    2. The solution is to try to only paginate to a reasonable amount of pages, as much as possible. In other words, keep the start parameter below 50,000 (&start=50000&rows=100)
    3. This also depends on the number of fields you have stored vs indexed fields, as, the more stored fields you have defined in your schema.xml the more RAM will be used for such high &start parameter, since Solr will have to allocate more memory for each field data.
  4. Are you using heavy FacettingSortingHighlighting, or Group Queries?
    1. The solution is that all these, as a best practice, should be done on docValues=true fields.
    2. That is to say, that, in your schema.xml you should define the fields used for Facetting Sorting and Grouping, as docValues=true
      1. Example: 
        <field name="name" docValues="true" type="text_general" indexed="true" stored="true" />
    3. In some cases, defining more parameteres will be useful, especially when using highlighting on certain fields.
      1. Example: 
        <field name="description" type="text_general" indexed="true" stored="true" docValues="true" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
  5. Field, Filter, Query and Document Caching
    1. In many cases, using such caching on your Solr index, will do more harm than good since those caches won't always be hit or the hit ratio will be very low, in which case memory will be filled with useless caches.
    2. The solrconfig.xml file, has multiple caching configurations that can be tweaked in order to ensure that caching is not overused:
      1. filterCache - storing unordered lists of document ids that have been returned by the “fq” (filterQuery) parameter of your queries.
      2. queryResultCache - stores document ids returned by searches
      3. documentCache - caches fieldValues that have been defined as “stored” in the schema, so that Solr does not have to go back to the index to fetch and return them for display.
      4. fieldCache - used to store all of the values for a field in memory rather than on to disk. For a large index, the fieldCache can occupy a lot of memory, especially if caching many fields.
    3. The solution is finding the definitions of these caches in your solrconfig.xml and setting them to a value as low as possible. 
      1. Example:
        <filterCache size="1" initialSize="1" autowarmCount="0"/>
  6. If using Drupal, make sure you update your Search API Solr module to the latest version, in order to fix this known bug.

 

Enable Spellcheck In Solr

Enabling spellcheck in Apache Solr is a useful feature that allows you to provide suggestions for misspelled or incorrect search queries. To enable spellcheck in Solr, you need to configure your Solr schema, Solr configuration files, and query parameters. Here's a step-by-step guide on how to do it:

  1. Schema Configuration:
    1. Open your Solr schema configuration file (usually named schema.xml) located in your Solr core's conf directory.
    2. Add a field type that specifies how you want to handle text for spellchecking. You can use the TextField type, for example:
      1. <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        </fieldType>
    3. Define a new field that uses this field type for your spellcheck suggestions. This field should be used for indexing your content.
      1. <field name="content" type="textSpell" indexed="true" stored="true"/>
    4. Add a new field for the spellcheck dictionary, where Solr will store its spellcheck suggestions.
      1. <field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
  2. Solr Configuration:
    1. Open your Solr configuration file (usually named solrconfig.xml) located in your Solr core's conf directory.
    2. Find the <requestHandler> configuration section for your search endpoint (e.g., /select) and add the spellcheck component to it. You should also configure other parameters as needed.
      1. <requestHandler name="/select" class="solr.SearchHandler"> 
        <!-- ... -->
        <arr name="last-components">
        <str>spellcheck</str>
        </arr>
        </requestHandler>
  3. Spellcheck Component Configuration:
    1. Still in the solrconfig.xml file, configure the spellcheck component. You can define its settings under the <searchComponent> section.
      1. <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
        <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
        <str name="distanceMeasure">internal</str>
        <float name="accuracy">0.5</float>
        <int name="maxEdits">2</int>
        <int name="minPrefix">1</int>
        <int name="maxInspections">5</int>
        <int name="minQueryLength">3</int>
        <float name="maxQueryFrequency">0.5</float>
        </lst>
        </searchComponent>
  4. Reindex Data:
    1. After making these schema and configuration changes, you need to reindex your data.
  5. Querying with Spellcheck:
    1. When making a search query to Solr, you can enable spellcheck suggestions by adding the spellcheck parameter to your query:
      1. /select?q=your_query&spellcheck=true
    2. Solr will return spellcheck suggestions in the response, typically under the spellcheck section.

By following these steps, you should be able to enable spellcheck in Apache Solr and provide search query suggestions for misspelled terms. Make sure to adjust the configuration parameters according to your specific use case and requirements.

 

Using the NLP Models in your schema_extra_types.xml:
<fieldType name="text_edge_nouns_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="nl-sent.bin" tokenizerModel="nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="nl-sent.bin" tokenizerModel="nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
To reference the NLP models correctly, you need to enter the full path.
For example, instead of 
sentenceModel="nl-sent.bin"
You would enter:
sentenceModel="/opt/nlp/nl-sent.bin"
Since that's where the nlp models reside, in Opensolr.
So here is the correct schema_extra_types.xml file:
<!--
  Dutch Edge NGram Nouns Field
  7.0.0
-->
<fieldType name="text_edge_nouns_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Dutch Nouns Field
  7.0.0
-->
<fieldType name="text_nouns_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_nl.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Dutch Text Field
  7.0.0
-->
<fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="nouns_nl.txt" minWordSize="5" minSubwordSize="4" maxSubwordSize="15" onlyLongestMatch="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_nl.txt" language="Kp"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_nl.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_nl.txt" language="Kp"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Dutch Text Field collated
  7.0.0
-->
<fieldType name="collated_nl" class="solr.ICUCollationField" locale="nl" strength="primary" caseLevel="false"/>
<!--
  Dutch Text Field unstemmed
  7.0.0
-->
<fieldType name="text_unstemmed_nl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="nouns_nl.txt" minWordSize="5" minSubwordSize="4" maxSubwordSize="15" onlyLongestMatch="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_nl.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Edge NGram String Field
  6.0.0
-->
<fieldType name="text_edgenstring" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Edge NGram Text Field
  7.0.0
-->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  English Nouns Field
  7.0.0
-->
<fieldType name="text_nouns_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_en.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  English Text Field
  7.0.0
-->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt" language="English"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_en.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt" language="English"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  English Text Field collated
  7.0.0
-->
<fieldType name="collated_en" class="solr.ICUCollationField" locale="en" strength="primary" caseLevel="false"/>
<!--
  English Text Field unstemmed
  7.0.0
-->
<fieldType name="text_unstemmed_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_en.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Fulltext Phonetic
  7.0.0
-->
<fieldType name="text_phonetic_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" languageSet="auto" nameType="GENERIC" ruleType="APPROX" concat="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" languageSet="auto" nameType="GENERIC" ruleType="APPROX" concat="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Fulltext Phonetic English
  7.0.0
-->
<fieldType name="text_phonetic_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" languageSet="english" nameType="GENERIC" ruleType="APPROX" concat="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" languageSet="english" nameType="GENERIC" ruleType="APPROX" concat="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Fulltext String Field
  6.0.0
-->
<fieldType name="text_string" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[\t\r\n]"/>
    <filters/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[\t\r\n]"/>
    <filters/>
  </analyzer>
</fieldType>
<!--
  Language Undefined Edge NGram Nouns Field
  7.0.0
-->
<fieldType name="text_edge_nouns_und" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_und.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_und.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_und.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Language Undefined Nouns Field
  7.0.0
-->
<fieldType name="text_nouns_und" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_und.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
    <filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
    <filter class="solr.TypeTokenFilterFactory" types="pos_nouns_und.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_und.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Language Undefined Text Field
  7.0.0
-->
<fieldType name="text_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_und.txt" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Language Undefined Text Field spellcheck
  7.0.0
-->
<fieldType name="text_spell_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  Language Undefined Text Field collated
  7.0.0
-->
<fieldType name="collated_und" class="solr.ICUCollationField" locale="" strength="primary" caseLevel="false"/>
<!--
  NGram String Field
  6.0.0
-->
<fieldType name="text_ngramstring" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
<!--
  NGram Text Field
  7.0.0
-->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
    <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Solr works with a set of multiple configuration files.
Each Solr configuration file, has it's own purpose.

Therefore, in some cases, some publishers (CMS systems, etc), will chose to create their own structure for such Solr configuration files, such as, it is the case with Drupal, and maybe WordPress (WPSOLR), and others.

When uploading your solr configuration files, using your Opensolr Index Control Panel, it is, therefore, important to upload your files in a specific order:

  1. Create and upload a .zip archive containing all your dependency config files such as .txt files, schema-extra.xml solrconfig-extra.xml, pretty much, everything except schema.xml and solrconfig.xml
  2. Create and upload a .zip archive containing your schema.xml file since that defines all fields and uses references to the archive you uploaded before (that contains schema-extra.xml and others like that)
  3. Create and upload a .zip archive containing your solrconfig.xml file, since this one will have references to field definitions inside your schema.xml and other dependency files.

So, basically, you should simply create those 3 archives and upload them separately, in this exact order, and then everything should work.
You can, of course automate this, by using the Automation REST API to upload your config files.

The auto phrase tokenfilter jar, is not active on all the Opensolr environments.
If you are having troubles using it, please contact us to request the installation of that library, or any other plugin solr library, by following the guide here, and we'll be happy to set it up for you.

If you keep getting redirected to the Login page, or you are having troubles with placing a new order, after trying to login multiple times, please try to clear the opensolr cookies, or use a different browser.

   

In order to use a custom jar Solr library, please send it to support@opensolr.com and our tech dept. will install it in your environment (you'll also need to specify your Opensolr Registration Email Address and the Opensolr Index Name), within about 24 hours (usually just takes a couple of hours if the plugin is fully compatible with your Solr version).

Make sure you send the .jar it's self, or links to official plugin page, where jar binaries are already compiled.

Click on the Tooks Menu Item on the right hand side, and then simply use the form to create your query and delete data.

To move from using the managed-schema to schema.xml, simply follow the steps below:

In your solrconfig.xml file, look for the schemaFactory definition.If you have one, remove it and add this instead:

<schemaFactory class="ClassicIndexSchemaFactory"/>

If you don't have it just add the above snippet somewhere above the requestHandlers definitions. 

 

To move from using the classic schema.xml in your opensolr index, to the managed-schema simply follow the steps below:

In your solrconfig.xml, look for a SchemaFactory definition, and replace it with this snippet:

   <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">true</bool>
      <str name="managedSchemaResourceName">managed-schema</str>
   </schemaFactory>

If you don't have any schemaFactory definition, just paste the above snippet to your solrconfig.xml file, just about any requestHandler definition.

Opensolr now supports any Solr Version that may be required by your project.

Solr Versions provided by Opensolr.com

 

Please go to https://opensolr.com/pricing and make sure you select the analytics option from the extra features tab, when you upgrade your account. 

If you can see analytics but no data, make sure your solr queries are correctly formated in the form:
https://server.opensolr.com/solr/index_name/select?q=your_query&other_params... 

So, the search query must be clearly visible in the q parameter in order for it to show in analytics. 

Here are a few ways to save your Monthy alloted Bandwidth:

  1. Using local caching, such as memcache, can greatly reduce the actual requests to make to our solr servers, thus saving you bandwidth
  2.  Since the allocated Bandwidth is PER INDEX, you could setup Solr Replication and setup your local application to perform round-robin requests in all of your replicas. This way, the bandwidth is saved by ballancing it between the ressources of multiple indexes. For example, if your account has 1 Gb PER INDEX, you will create Index A and replicate it onto Index B, and make requests in both of those in a round-robin fashion, thus gaining 2 Gb total of bandwidth for your index.
  3. Make sure your solr queries will return as little data as possible. For example use the rows or fl parameters for the solr /select requests, to only return the records and the fields you really need. Any other data the gets returned will be counted as extra bandwidth.

There are a couple things you might be able to do to trade performance for index size. For example, an integer (int) field uses less space than a trie integer (tint), but range queries will be slower when using an int.

To make major reductions in your index, you will almost certainly need to look more closely at the fields you are using.

  • Are you using a lot of stored fields? If so, try removing the stored fields from the index and query your database for the necessary data once you've got the results back from Solr.
  • Add omitNorms="true" to text fields that don't need length normalization
  • Add omitPositions="true" to text fields that don't require phrase matching
  • Special fields, like NGrams, can take up a lot of space
  • Are you removing stop words from text fields?

OpenSolr is a cloud-based search service that provides hosting, management, and support for Apache Solr, a popular open-source search platform. Apache Solr is widely used for its powerful full-text search, hit highlighting, faceted search, dynamic clustering, and rich document handling capabilities. Here are some key aspects of OpenSolr services:

  1. Managed Solr Hosting: OpenSolr takes care of the setup, maintenance, and management of Solr instances. This includes handling server infrastructure, Solr installation, and configuration.

  2. Scalability and Performance: OpenSolr provides scalable solutions, allowing users to increase or decrease resources based on their requirements. This ensures efficient handling of varying search demands and data volumes.

  3. Data Security and Backups: OpenSolr offers secure hosting environments with features like SSL encryption, data backups, and recovery options to protect against data loss and ensure data integrity.

  4. Customizable Search Indexes: Users can create and customize their search indexes according to their specific needs. This includes defining schemas, setting up data import handlers, and configuring search parameters.

  5. User-friendly Interface: OpenSolr typically provides an easy-to-use control panel, allowing users to manage their Solr instances without deep technical knowledge. This includes features for index management, monitoring, and analytics.

  6. Support and Consultation: OpenSolr offers support services, including technical assistance and consulting, to help users optimize their Solr implementations for better search performance and reliability.

  7. Integration and API Support: OpenSolr services often include APIs and integration options, enabling seamless integration with various applications and data sources.

  8. Global Data Centers: To ensure fast and reliable access, OpenSolr may host its services in multiple data centers across the globe.

OpenSolr is suitable for businesses and developers who require robust search capabilities but want to avoid the complexities of self-hosting and managing Solr servers. It's often chosen for its scalability, reliability, and ease of use, particularly in applications like e-commerce platforms, content management systems, and data-intensive websites.

 
 

EZcmd.com is a useful set of GeoData and GeoIP utilities.

Here are a few screenshots

GEOIP Tools GEOIP Tools GEOIP Tools GEOIP Tools




Review us on Google Business
Close Check out the Fully Automated Solr server setup for Drupal & WordPress.