Select a category on the left, to get your answers quickly
What is the Opensolr Traffic Bandwidth Limit:
The Traffic Bandwidth Limit, is there, instead of limiting your usage, by the number of requests.
At Opensolr, we recognize that there may be those use cases where you could have a lot of smaller requests, and it's not fair to bill and charge for the number of those requests, but rather, this will charge you (on a pre-paid plan), on the number of outgoing bytes that are being sent from your Opensolr Index, to your website or application. Therefore, 1 Gb of Traffic could mean 1 million requests (if you know how to optimize your requests), or it could be 1 request (if you don't know how to optimize your requests).
If you are wondering about your Search Traffic Bandwidth Usage, it is most likely, that a Bot, or even an attack has reached the search pages throughout your website, and this way, they will pass that traffic on to our servers here at Opensolr.
Solution:
First, read here, to learn about a few ways to save traffic bandwidth.
Also, read our Best Practices Guide, which may also help with saving traffic bandwidth.
Opensolr, transparently logs every single request made to our Solr infrastructure.
This means, that you get full access to see all requests, either via our Automation API, or via the Opensolr Index Control Panel.
Examples:
If you were using Solr's DataImport Handler, starting with Solr 9.x that is no longer possible.
Here's how to write a small script that will import data into your Opensolr Index, from XML files:
#!/bin/bash USERNAME="<OPENSOLR_INDEX_HTTP_AUTH_USERNAME>" PASSWORD="<OPENSOLR_INDEX_HTTP_AUTH_PASSWORD>" echo "Starting import on all indexes..." echo "" echo "Importing: <YOUR_OPENSOLR_INDEX_NAME>" echo "Downloading the xml data file" wget -q <URL_TO_YOUR_XML_FILE>/<YOUR_XML_FILE_NAME> echo "Removing all data" curl -s -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" -H "Content-Type: text/xml" -d "*:*" echo "" echo "Uploading and Importing all data into <YOUR_OPENSOLR_INDEX_NAME>" curl -u $USERNAME:$PASSWORD "https://<YOUR_OPENSOLR_INDEX_HOSTNAME>/solr/<YOUR_OPENSOLR_INDEX_NAME>/update?commit=true&wt=json&indent=true" --progress-bar -H "Content-Type: text/xml" --data-binary @<YOUR_XML_FILE_NAME> | tee -a "/dev/null" ; test ${PIPESTATUS[0]} -eq 0 echo "" rm -rf <YOUR_XML_FILE_NAME> echo "Done!" echo "" echo "" echo ""
Now, the way this is made, is that if you have a minimal tech background, you can understand that everything within the <> brackets will have to be replaced with your Opensolr Index Name, your Opensolr Index Hostname, the URL for your XML file, and so forth. You can get all that info in your Opensolr Index Control Panel. Except for the URL to your XML file, which that is hosted somewhere on your end.
The way you format your XML file, is the classic Solr format.
This article may should show you more about the Solr XML Data File format.
Solr will often use quite a lot of RAM Memory, in order to build the search results response.
Therefore it is important that we follow a few Best Practices, in order to ensure that we do not overuse any resources that would not otherwise be needed.
It often happens that a dedicated Opensolr Environment, with quite an extensive amount of RAM Memory, won't be able to handle rather small Solr Indexes, because of the wrong implementation of certain Solr schema.xml configuration parameters, and other requests that will cause Solr to be killed by the Solr OOM script, when the Environment runs out of memory.
Also, Opensolr has a self-healing processes that will kick in for any crashed Solr process, recovering the Solr service in just under 1 minute.
Here are some Best Practices that you can use, to mitigate these effects:
<field name="name" docValues="true" type="text_general" indexed="true" stored="true" />
<field name="description" type="text_general" indexed="true" stored="true" docValues="true" termVectors="true" termPositions="true" termOffsets="true" storeOffsetsWithPositions="true" />
<filterCache size="1" initialSize="1" autowarmCount="0"/>
Enabling spellcheck in Apache Solr is a useful feature that allows you to provide suggestions for misspelled or incorrect search queries. To enable spellcheck in Solr, you need to configure your Solr schema, Solr configuration files, and query parameters. Here's a step-by-step guide on how to do it:
schema.xml
) located in your Solr core's conf
directory.TextField
type, for example:
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="content" type="textSpell" indexed="true" stored="true"/>
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
solrconfig.xml
) located in your Solr core's conf
directory.<requestHandler>
configuration section for your search endpoint (e.g., /select
) and add the spellcheck component to it. You should also configure other parameters as needed.
<requestHandler name="/select" class="solr.SearchHandler">
<!-- ... -->
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
solrconfig.xml
file, configure the spellcheck component. You can define its settings under the <searchComponent>
section.
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.5</float>
<int name="maxEdits">2</int>
<int name="minPrefix">1</int>
<int name="maxInspections">5</int>
<int name="minQueryLength">3</int>
<float name="maxQueryFrequency">0.5</float>
</lst>
</searchComponent>
spellcheck
parameter to your query:
/select?q=your_query&spellcheck=true
spellcheck
section.By following these steps, you should be able to enable spellcheck in Apache Solr and provide search query suggestions for misspelled terms. Make sure to adjust the configuration parameters according to your specific use case and requirements.
<fieldType name="text_edge_nouns_nl" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="nl-sent.bin" tokenizerModel="nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="nl-sent.bin" tokenizerModel="nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_nl.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
sentenceModel="nl-sent.bin"
sentenceModel="/opt/nlp/nl-sent.bin"
<!--
Dutch Edge NGram Nouns Field
7.0.0
-->
<fieldType name="text_edge_nouns_nl" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_nl.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Dutch Nouns Field
7.0.0
-->
<fieldType name="text_nouns_nl" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/nl-sent.bin" tokenizerModel="/opt/nlp/nl-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/nl-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_nl.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_nl.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Dutch Text Field
7.0.0
-->
<fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="nouns_nl.txt" minWordSize="5" minSubwordSize="4" maxSubwordSize="15" onlyLongestMatch="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords_nl.txt" language="Kp"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_nl.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords_nl.txt" language="Kp"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Dutch Text Field collated
7.0.0
-->
<fieldType name="collated_nl" class="solr.ICUCollationField" locale="nl" strength="primary" caseLevel="false"/>
<!--
Dutch Text Field unstemmed
7.0.0
-->
<fieldType name="text_unstemmed_nl" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="nouns_nl.txt" minWordSize="5" minSubwordSize="4" maxSubwordSize="15" onlyLongestMatch="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_nl.txt"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_nl.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_nl.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_nl.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Edge NGram String Field
6.0.0
-->
<fieldType name="text_edgenstring" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Edge NGram Text Field
7.0.0
-->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
English Nouns Field
7.0.0
-->
<fieldType name="text_nouns_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_en.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_en.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
English Text Field
7.0.0
-->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt" language="English"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_en.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt" language="English"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
English Text Field collated
7.0.0
-->
<fieldType name="collated_en" class="solr.ICUCollationField" locale="en" strength="primary" caseLevel="false"/>
<!--
English Text Field unstemmed
7.0.0
-->
<fieldType name="text_unstemmed_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_en.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_en.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Fulltext Phonetic
7.0.0
-->
<fieldType name="text_phonetic_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.BeiderMorseFilterFactory" languageSet="auto" nameType="GENERIC" ruleType="APPROX" concat="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.BeiderMorseFilterFactory" languageSet="auto" nameType="GENERIC" ruleType="APPROX" concat="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Fulltext Phonetic English
7.0.0
-->
<fieldType name="text_phonetic_en" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.BeiderMorseFilterFactory" languageSet="english" nameType="GENERIC" ruleType="APPROX" concat="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_en.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.BeiderMorseFilterFactory" languageSet="english" nameType="GENERIC" ruleType="APPROX" concat="true"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Fulltext String Field
6.0.0
-->
<fieldType name="text_string" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<tokenizer class="solr.PatternTokenizerFactory" pattern="[\t\r\n]"/>
<filters/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.PatternTokenizerFactory" pattern="[\t\r\n]"/>
<filters/>
</analyzer>
</fieldType>
<!--
Language Undefined Edge NGram Nouns Field
7.0.0
-->
<fieldType name="text_edge_nouns_und" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_und.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_edge_nouns_und.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_edge_nouns_und.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Language Undefined Nouns Field
7.0.0
-->
<fieldType name="text_nouns_und" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_und.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.OpenNLPTokenizerFactory" sentenceModel="/opt/nlp/en-sent.bin" tokenizerModel="/opt/nlp/en-token.bin"/>
<filter class="solr.OpenNLPPOSFilterFactory" posTaggerModel="/opt/nlp/en-pos-maxent.bin"/>
<filter class="solr.TypeTokenFilterFactory" types="pos_nouns_und.txt" useWhitelist="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_nouns_und.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Language Undefined Text Field
7.0.0
-->
<fieldType name="text_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms_und.txt" expand="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="0" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="0"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Language Undefined Text Field spellcheck
7.0.0
-->
<fieldType name="text_spell_und" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer>
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
Language Undefined Text Field collated
7.0.0
-->
<fieldType name="collated_und" class="solr.ICUCollationField" locale="" strength="primary" caseLevel="false"/>
<!--
NGram String Field
6.0.0
-->
<fieldType name="text_ngramstring" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
<!--
NGram Text Field
7.0.0
-->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100" storeOffsetsWithPositions="true">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_und.txt"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" protected="protwords_und.txt" splitOnCaseChange="0" generateWordParts="1" preserveOriginal="1" catenateAll="0" catenateWords="1"/>
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="accents_und.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="100"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
Solr works with a set of multiple configuration files.
Each Solr configuration file, has it's own purpose.
Therefore, in some cases, some publishers (CMS systems, etc), will chose to create their own structure for such Solr configuration files, such as, it is the case with Drupal, and maybe WordPress (WPSOLR), and others.
When uploading your solr configuration files, using your Opensolr Index Control Panel, it is, therefore, important to upload your files in a specific order:
So, basically, you should simply create those 3 archives and upload them separately, in this exact order, and then everything should work.
You can, of course automate this, by using the Automation REST API to upload your config files.
The auto phrase tokenfilter jar, is not active on all the Opensolr environments.
If you are having troubles using it, please contact us to request the installation of that library, or any other plugin solr library, by following the guide here, and we'll be happy to set it up for you.
To move from using the managed-schema to schema.xml, simply follow the steps below:
In your solrconfig.xml file, look for the schemaFactory definition.If you have one, remove it and add this instead:
<schemaFactory class="ClassicIndexSchemaFactory"/>
If you don't have it just add the above snippet somewhere above the requestHandlers definitions.
To move from using the classic schema.xml in your opensolr index, to the managed-schema simply follow the steps below:
In your solrconfig.xml, look for a SchemaFactory definition, and replace it with this snippet:
<schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str name="managedSchemaResourceName">managed-schema</str> </schemaFactory>
If you don't have any schemaFactory definition, just paste the above snippet to your solrconfig.xml file, just about any requestHandler definition.
Please go to https://opensolr.com/pricing and make sure you select the analytics option from the extra features tab, when you upgrade your account.
If you can see analytics but no data, make sure your solr queries are correctly formated in the form:
https://server.opensolr.com/solr/index_name/select?q=your_query&other_params...
So, the search query must be clearly visible in the q parameter in order for it to show in analytics.
Here are a few ways to save your Monthy alloted Bandwidth:
There are a couple things you might be able to do to trade performance for index size. For example, an integer (int) field uses less space than a trie integer (tint), but range queries will be slower when using an int.
To make major reductions in your index, you will almost certainly need to look more closely at the fields you are using.
EZcmd.com is a useful set of GeoData and GeoIP utilities.
Here are a few screenshots