I found very little (if anything) out there from people trying to set up Solr 6 and Dovecot 2, so I thought I’d post this up. Perhaps it will help someone else along the way.
What I’m running:-
- FreeBSD 12
- Openjdk 8
- Dovecot 2 (installed from pkgs)
- Apache-solr 8 (downloaded direct from apache)
I’m not going to cover how to install and configure dovecot here but in short I installed it using “pkg install dovecot2” on FreeBSD and configured based on my needs in /usr/local/etc/dovecot/dovecot.conf along with the following entries to enable FTS and Solr
...
mail_plugins = $mail_plugins quota fts fts_solr
...
plugin {
  fts_autoindex=yes
  fts = solr
  fts_solr = url=http://localhost:8983/solr/dovecot/
}
...
protocol imap {
  mail_plugins = quota imap_quota fts fts_solr
}
protocol lmtp {
  mail_plugins = quota fts fts_solr
}
...
The above assumes that you’re running solr on localhost of course.
Now, in Solr version 5+ you no longer need Tomcat or Jetty installed separately to run Solr, so ignore any articles that advise you to install these unless you’re running Solr version 4 or older. Solr now ships with its own copy of Jetty and all the startup scripts you need.
- All you need to install is OpenJDK 8 JRE and make sure that your java version output is at least version 1.8 as below
# java -version
openjdk version "1.8.0_92"
OpenJDK Runtime Environment (build 1.8.0_92-b14)
OpenJDK 64-Bit Server VM (build 25.92-b14, mixed mode)
Version 1.7 (OpenJDK 7) will not work with solr and you will find that solr will not start
pkg install openjdk8-jre-8 (try pkg search openjdk8 - you need JRE)
This will prompt you to install numerous java packages and other bits and bobs.
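If you want to script the Java check rather than eyeball it, something along these lines should do. This is only a rough sketch: the helper names parse_java_version and java_version_ok are made up for this example, and it assumes the version string looks like the output shown above (or the newer "11.0.2" style).

```shell
#!/bin/sh
# Sketch only: both helper names are invented for this example.

# Pull the quoted version string out of `java -version` output on stdin.
# Example input line: openjdk version "1.8.0_92"
parse_java_version() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed (exit 0) if the version string is Java 8 (1.8) or newer.
# Handles the legacy "1.8.0_92" form and the newer "11.0.2" form.
java_version_ok() {
    ver=$1
    major=${ver%%.*}
    if [ "$major" = "1" ]; then
        rest=${ver#*.}
        minor=${rest%%.*}
        [ "$minor" -ge 8 ]
    else
        [ "$major" -ge 8 ]
    fi
}
```

Note that java prints its version to stderr, so you would call it as: v=$(java -version 2>&1 | parse_java_version) and then: java_version_ok "$v" || echo "Java is too old for Solr"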
- Download a copy of apache solr and extract to a suitable location. The below is using Solr 6 but I’ll come on to the differences you need between 6 and 8.
mkdir /opt
cd /opt
fetch http://www-eu.apache.org/dist/lucene/solr/6.1.0/solr-6.1.0.tgz
tar -xvzf solr-6.1.0.tgz
mv solr-6.1.0 solr
- Solr should now start without issue by running the below from the /opt/solr directory
bin/solr start
Once started, it will drop to the background.
- Now that Solr is running, you can create a brand new collection for Dovecot
bin/solr create -c dovecot -n dovecot
- If all went well you’ll have a new collection made, so let’s navigate to the configuration folder
cd /opt/solr/server/solr/dovecot/conf/
- Once there you’ll need to create a file named schema.xml and load it with the schema below that matches your Dovecot version.
Dovecot 2.2:-
<?xml version="1.0" encoding="UTF-8" ?>
<!--
For fts-solr: This is the Solr schema file, place it into solr/conf/schema.xml.
You may want to modify the tokenizers and filters.
-->
<schema name="dovecot" version="1.5">
  <types>
    <!-- IMAP has 32bit unsigned ints but java ints are signed, so use longs -->
    <fieldType name="string" class="solr.StrField" />
    <fieldType name="long" class="solr.TrieLongField" />
    <fieldType name="boolean" class="solr.BoolField" />
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="uid" type="long" indexed="true" stored="true" required="true" />
    <field name="box" type="string" indexed="true" stored="true" required="true" />
    <field name="user" type="string" indexed="true" stored="true" required="true" />
    <field name="hdr" type="text" indexed="true" stored="false" />
    <field name="body" type="text" indexed="true" stored="false" />
    <field name="from" type="text" indexed="true" stored="false" />
    <field name="to" type="text" indexed="true" stored="false" />
    <field name="cc" type="text" indexed="true" stored="false" />
    <field name="bcc" type="text" indexed="true" stored="false" />
    <field name="subject" type="text" indexed="true" stored="false" />
    <!-- Used by Solr internally: -->
    <field name="_version_" type="long" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
Dovecot 2.3:-
<?xml version="1.0" encoding="UTF-8"?>
<schema name="dovecot" version="2.0">
  <fieldType name="string" class="solr.StrField" omitNorms="true" sortMissingLast="true"/>
  <fieldType name="long" class="solr.LongPointField" positionIncrementGap="0"/>
  <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
  <fieldType name="text" class="solr.TextField" autoGeneratePhraseQueries="true" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1"/>
      <filter class="solr.FlattenGraphFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.FlattenGraphFilterFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" catenateAll="1" catenateWords="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>
  <field name="id" type="string" indexed="true" required="true" stored="true"/>
  <field name="uid" type="long" indexed="true" required="true" stored="true"/>
  <field name="box" type="string" indexed="true" required="true" stored="true"/>
  <field name="user" type="string" indexed="true" required="true" stored="true"/>
  <field name="hdr" type="text" indexed="true" stored="false"/>
  <field name="body" type="text" indexed="true" stored="false"/>
  <field name="from" type="text" indexed="true" stored="false"/>
  <field name="to" type="text" indexed="true" stored="false"/>
  <field name="cc" type="text" indexed="true" stored="false"/>
  <field name="bcc" type="text" indexed="true" stored="false"/>
  <field name="subject" type="text" indexed="true" stored="false"/>
  <!-- Used by Solr internally: -->
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
- Now you need to delete the managed schema file that comes with solr 5+ as we will not be using a managed schema
rm managed-schema
- Next, open up the below file in a text editor, i.e.
vi solrconfig.xml
For Solr 5 and 6 completely remove this section
<processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
  <str name="defaultFieldType">strings</str>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Boolean</str>
    <str name="fieldType">booleans</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.util.Date</str>
    <str name="fieldType">tdates</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Long</str>
    <str name="valueClass">java.lang.Integer</str>
    <str name="fieldType">tlongs</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Number</str>
    <str name="fieldType">tdoubles</str>
  </lst>
</processor>
For solr 8 you need to do the following:-
1. Remove the add-schema-fields entry from the processor attribute of the <updateRequestProcessorChain> element.
2. Completely remove the below section
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
  <lst name="typeMapping">
    <str name="valueClass">java.lang.String</str>
    <str name="fieldType">text_general</str>
    <lst name="copyField">
      <str name="dest">*_str</str>
      <int name="maxChars">256</int>
    </lst>
    <!-- Use as default mapping instead of defaultFieldType -->
    <bool name="default">true</bool>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Boolean</str>
    <str name="fieldType">booleans</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.util.Date</str>
    <str name="fieldType">pdates</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Long</str>
    <str name="valueClass">java.lang.Integer</str>
    <str name="fieldType">plongs</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Number</str>
    <str name="fieldType">pdoubles</str>
  </lst>
</updateProcessor>
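For step 1 on Solr 8, the processor attribute is a comma-separated list, so taking add-schema-fields out also means taking out the comma next to it. A rough sketch of doing that with sed follows. The helper name strip_add_schema_fields is made up for this example, and the list of other processor names in your file may differ from whatever you test it with, so diff the result before replacing the original.

```shell
#!/bin/sh
# Sketch only: strip_add_schema_fields is an invented helper name.
# Reads solrconfig.xml text on stdin and drops "add-schema-fields" from a
# comma-separated processor list, whether it sits last or in the middle.
# The name="add-schema-fields" attribute of the <updateProcessor> element is
# not touched because it has no adjacent comma; that whole section is removed
# by hand in step 2.
strip_add_schema_fields() {
    sed -e 's/,add-schema-fields//' -e 's/add-schema-fields,//'
}
```

One way to use it: strip_add_schema_fields < solrconfig.xml > solrconfig.xml.new, then compare the two files with diff and move the new one into place once you are happy with it.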
Now for all versions, in the same file find the line that reads the below
See http://wiki.apache.org/solr/GuessingFieldTypes-->
and add in the following just below it
<schemaFactory class="ClassicIndexSchemaFactory"> </schemaFactory>
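Before restarting, it can be worth double checking that the edits to solrconfig.xml actually stuck. The following is a minimal sketch using grep. The function name check_solrconfig is made up for this example, and it is a plain text match, so a mention of the class inside an XML comment would show up as a false positive.

```shell
#!/bin/sh
# Sketch only: check_solrconfig is an invented helper name.
# Succeeds (exit 0) if the file names ClassicIndexSchemaFactory and no longer
# mentions the schema-guessing processor; prints what is wrong otherwise.
check_solrconfig() {
    conf=$1
    status=0
    if ! grep -q 'ClassicIndexSchemaFactory' "$conf"; then
        echo "missing: schemaFactory with class ClassicIndexSchemaFactory"
        status=1
    fi
    if grep -q 'AddSchemaFieldsUpdateProcessorFactory' "$conf"; then
        echo "still present: AddSchemaFieldsUpdateProcessorFactory"
        status=1
    fi
    if grep -q 'add-schema-fields' "$conf"; then
        echo "still present: add-schema-fields"
        status=1
    fi
    return $status
}
```

With the paths used in this post you would run it as: check_solrconfig /opt/solr/server/solr/dovecot/conf/solrconfig.xml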
- With a bit of luck that’s everything you need to do, so go ahead and restart Solr by issuing /opt/solr/bin/solr restart, then head straight to the log file, where you should hopefully see the following.
tail -f /opt/solr/server/logs/solr.log
......
2016-08-15 14:40:37.515 INFO (coreLoadExecutor-6-thread-1) [ x:dovecot] o.a.s.c.CoreContainer registering core: dovecot
If you see any errors about failing to register field types, double check over the steps above and make sure you didn’t miss anything or something managed to creep back in
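Besides reading the log, you can ask Solr directly whether the core loaded by using the CoreAdmin STATUS action. The sketch below only inspects the JSON that comes back. The helper name core_is_loaded is made up, curl is an assumption (it is not in the FreeBSD base system), and as far as I know a core that failed to load comes back as an empty entry with no "name" field.

```shell
#!/bin/sh
# Sketch only: core_is_loaded is an invented helper name.
# Reads a CoreAdmin STATUS response (JSON) on stdin and succeeds if it
# contains a "name" entry for the given core.
core_is_loaded() {
    grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1\""
}
```

For the setup in this post that would be: curl -s 'http://localhost:8983/solr/admin/cores?action=STATUS&core=dovecot&wt=json' | core_is_loaded dovecot && echo "core loaded"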
- Finally, open up another terminal and tail the Solr log file as above whilst running the following command in Dovecot, and you should have all your emails indexed into Solr 🙂
doveadm index -A inbox
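To confirm that documents really landed in the index, you can run a match-all query with rows=0 and read the numFound value from the response. A minimal sketch, assuming curl is installed and that you request JSON with wt=json; the helper name extract_num_found is made up for this example.

```shell
#!/bin/sh
# Sketch only: extract_num_found is an invented helper name.
# Reads a Solr JSON select response on stdin and prints the numFound value.
extract_num_found() {
    sed -n 's/.*"numFound":\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Usage with the URL from this post: curl -s 'http://localhost:8983/solr/dovecot/select?q=*:*&rows=0&wt=json' | extract_num_found. A number greater than zero means at least some mail has been indexed.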
Good luck!
Trying to implement this now but hitting an issue. The schema is in place and the other edits seem OK, however Solr is throwing an exception (below). Any thoughts? Some data appears to occasionally make it into the index.
doveadm index -A inbox causes:
2016-08-23 16:47:35.925 ERROR (qtp1729199940-19) [ x:dovecot] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Bad contentType for search handler :text/xml request={q=box:ed16a125a860bc573e4500006ce5411f+user:"username"&fl=uid&sort=uid+desc&rows=1}
at org.apache.solr.request.json.RequestUtil.processParams(RequestUtil.java:72)
at org.apache.solr.util.SolrPluginUtils.setDefaults(SolrPluginUtils.java:176)
….
Solr 6.1.0 running in a docker container and Dovecot 2.2.13 running on Debian Jessie.
I have the same problem with Solr 6.3
Hi,
I actually think this is a problem in the client not the server.
According to this: https://github.com/solariumphp/solarium/issues/326
The GET request should not set any “Content-Type”. And this is an issue only on newer Solr versions.
The link is actually for the Solr PHP client Solarium, but I believe it’s the same problem.
btw: your indexing should be working, just not the search queries (which renders it useless).
If you find a solution please post it, I’m on the same road as you!
Good luck
Regards
Tiago
The bug is already patched:
https://bugzilla.redhat.com/show_bug.cgi?id=1360774
although the current CentOS 7 base repo version of dovecot is still 2.2.10 (I use CentOS)
sudo yum list installed | grep dovecot
dovecot.x86_64 1:2.2.10-7.el7 @base
dovecot-mysql.x86_64 1:2.2.10-7.el7 @base
dovecot-pigeonhole.x86_64 1:2.2.10-7.el7 @base
According to the devs, the fix was rolled out in 2.2.19.
I first ran tcpdump on the traffic between dovecot and solr to capture the HTTP call made by dovecot.
I tried the same call via telnet using “Content-type: application/x-www-form-urlencoded” and I got no errors. So I recompiled dovecot, changing “text/xml” to “application/x-www-form-urlencoded” in the file src/plugins/fts-solr/solr-connection.c. Pay attention, there are two occurrences to substitute.
Many thanks! I was working off the info on the dovecot site and other places using an older version of Solr and not having any joy. It worked first time using the info you’ve posted here, good job!
thanks! Andy.
No problem, glad you found it useful 🙂
James
hi,
I’m using Ubuntu 16.04 and followed your post with dovecot 2.2 and solr 6.1.0, but I can’t get it to run.
I get this error:
doveadm(tungnt@msapp.xyz): Info: INBOX: Caching mails seq=1..23
doveadm(tungnt@msapp.xyz): Error: fts_solr: Indexing failed: 404 Not Found
doveadm(tungnt@msapp.xyz): Panic: file http-client-request.c: line 723 (http_client_request_send_payload): assertion failed: (ret == 0)
when I run this command:
doveadm -v index -u tungnt@msapp.xyz Inbox
Do you have any ideas, please?
Thanks!
Hi,
What output are you getting when you just run:-
/usr/local/bin/curl "http://localhost:8983/solr/dovecot/update?optimize=true"
James
Hi,
why do you suggest to remove the section?
Thanks
The AddSchemaFieldsUpdateProcessorFactory section? That’s a newer schema style in solr as far as I know which I personally don’t have a schema for to use with Dovecot. If someone has a schema that will work with it feel free to share 🙂