After quite a bit of searching I couldn’t find a simple example of setting up a single instance of SolrCloud on a local machine. I work with Solr every day. I do most of my development on my laptop, and when everything is good I commit it and deploy. We recently hired someone who will be working with Solr and I wanted to get his laptop set up to run SolrCloud locally, too. However, I found it difficult to locate a document that I could just point him to. This is that document.
Note that I understand the installation described below is likely to be useful only when doing development. What’s the point of having a distributed SolrCloud when it’s only running on one machine?
See https://cwiki.apache.org/confluence/display/solr/SolrCloud+with+Legacy+Configuration+Files for a list of required SolrCloud configurations.
ZooKeeper (https://zookeeper.apache.org/) is a centralized service that is used to coordinate configuration information. You will be telling ZooKeeper where to find SolrCloud configuration files.
SolrCloud comes with an embedded ZooKeeper. However, our production configuration uses ZooKeeper as a stand-alone system and I want to mimic production.
- Download ZooKeeper from Apache’s site https://zookeeper.apache.org/
- Extract the downloaded file.
- Follow the steps outlined in the getting started guide https://zookeeper.apache.org/doc/r3.3.4/zookeeperStarted.html. Here are the basics. Be aware that this may change with future versions of ZooKeeper.
- Copy ZOOKEEPER_DIR/conf/zoo_sample.cfg to ZOOKEEPER_DIR/conf/zoo.cfg
- I changed the value of dataDir in ZOOKEEPER_DIR/conf/zoo.cfg to an existing empty directory
- Start ZooKeeper: ZOOKEEPER_DIR/bin/zkServer.sh start
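The configuration steps above can be scripted. Here is a minimal sketch, not something taken from the ZooKeeper guide: the function name configure_zookeeper and its two arguments are made up for illustration, and it assumes zoo_sample.cfg contains a dataDir= line (the stock sample does).

```shell
#!/bin/sh
# Hypothetical helper: write zoo.cfg from the sample with a custom dataDir.
# Usage: configure_zookeeper ZOOKEEPER_DIR DATA_DIR
# Note: DATA_DIR must not contain "|" or "&", which are special to sed here.
configure_zookeeper() {
    zk_dir="$1"
    data_dir="$2"

    # Make sure the data directory exists before ZooKeeper is started.
    mkdir -p "$data_dir" || return 1

    # Copy the sample config, replacing only the dataDir line.
    sed "s|^dataDir=.*|dataDir=$data_dir|" \
        "$zk_dir/conf/zoo_sample.cfg" > "$zk_dir/conf/zoo.cfg"
}
```

With placeholder paths, a call would look like configure_zookeeper /path/to/zookeeper /path/to/empty-data-dir, followed by the zkServer.sh start command.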
Verify that ZooKeeper is running:
ZOOKEEPER_DIR/bin/zkCli.sh -server 127.0.0.1:2181
You should see a command prompt that looks something like this:
[zk: 127.0.0.1:2181(CONNECTED) 0]
Enter quit to exit the client:
[zk: 127.0.0.1:2181(CONNECTED) 0] quit
If you get a java.net.ConnectException: Connection refused error you know the server is not running.
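For scripts, a non-interactive check can be handier than opening the client. The sketch below sends ZooKeeper's "ruok" four-letter command with nc and expects "imok" back. Assumptions: nc is installed, the function name zk_is_running is my own, and the server accepts four-letter commands (newer ZooKeeper releases may require whitelisting them via 4lw.commands.whitelist).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if ZooKeeper answers "imok" to "ruok".
# Usage: zk_is_running [HOST] [PORT]
zk_is_running() {
    host="${1:-127.0.0.1}"
    port="${2:-2181}"

    # -w 2 gives up after two seconds. Any failure (connection refused,
    # nc not installed) leaves reply empty.
    reply=$(echo ruok | nc -w 2 "$host" "$port" 2>/dev/null) || reply=""

    [ "$reply" = "imok" ]
}
```

A typical use would be zk_is_running && echo "ZooKeeper is up" before going on to the Solr steps.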
ZooKeeper and Solr’s Configuration Files
Use the SOLR_DIR/example/scripts/cloud-scripts/zkcli.sh script to upload Solr’s configuration files to ZooKeeper:
SOLR_DIR/example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig -confdir SOLR_DIR/example/solr/my-collection/conf -confname my-collection-config
From this you should see a bunch of output including a list of the configuration files found in the directory pointed to by the “-confdir” flag.
Start up an instance of SolrCloud:
SOLR_DIR/bin/solr start -z localhost:2181 -cloud
The “-cloud” flag tells Solr to start in a cloud configuration. The “-z localhost:2181” flag tells Solr how to connect to ZooKeeper where it will find configuration information.
You may now look at the SolrCloud admin page found here: http://localhost:8983/solr/#/
Create New Solr Collection
So far we’ve uploaded a set of Solr configuration files to ZooKeeper and started an instance of SolrCloud. Next we need to create a new Solr collection telling Solr how to find its configuration in ZooKeeper.
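The create request goes to Solr's Collections API. The sketch below only builds the URL so the parameters are easy to see. The collection name my-collection and the helper name create_collection_url are assumptions of mine; the config name and the numShards value of 1 are the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print a Collections API CREATE URL.
# Usage: create_collection_url SOLR_URL COLLECTION CONFIG_NAME NUM_SHARDS
create_collection_url() {
    solr_url="$1"
    collection="$2"
    config_name="$3"
    num_shards="$4"

    printf '%s/admin/collections?action=CREATE&name=%s&numShards=%s&collection.configName=%s\n' \
        "$solr_url" "$collection" "$num_shards" "$config_name"
}

# With SolrCloud running, the request could be sent like this
# (quote the URL so the shell does not interpret the "&" characters):
#   curl "$(create_collection_url http://localhost:8983/solr my-collection my-collection-config 1)"
```

Keeping the URL construction separate from the curl call makes it easy to print the request and check it before sending anything to Solr.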
Notice how the value of “collection.configName” in the create request is the same as the name used in the “upconfig” command that was sent to ZooKeeper: my-collection-config. This tells Solr to use that name when asking ZooKeeper for the configuration for this new collection.
The “numShards” parameter is required. The documentation (https://cwiki.apache.org/confluence/display/solr/Collections+API) is a little confusing: the table says it is not required but the description says otherwise. I found that if I do not provide the “numShards” parameter the response from Solr is:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: numShards is a required param
As such, I just set the value to 1 and everything works as expected.
To make this process easier I created a few scripts, which can be found on GitHub: https://github.com/likethecolor/solr-scripts