Zookeeper >> mail # user >> Start-up of new Solr shard leaders or existing shard replicas depends on zookeeper?


Start-up of new Solr shard leaders or existing shard replicas depends on zookeeper?
Hi zookeepers,


I'm seeing some confusing behaviour in Solr/zookeeper and hope you can
shed some light on what's happening/how I can correct it. (I've also
asked the ~same question to the Solr mailing list but I think it's a ZK
issue.)


We have two physical servers, both automated builds of Red Hat 6.4 and
Solr 4.4.0, hosting two separate Solr services. We also have two
separate ZK ensembles, a production version and a development version,
both running 3.4.5 and built via the same automation. The first Solr
server (called ld01) has 24 shards and hosts a collection called
'ukdomain'; the second Solr server (ld02) also has 24 shards and hosts a
different collection called 'ldwa01'. It's evidently important to note
that previously both of these physical servers provided the 'ukdomain'
collection, but ld02 has been rebuilt for the new collection.
ld01/ukdomain connects to the production ZK; which ensemble ld02/ldwa01
connects to depends, and that's the problem.


When I start the ldwa01 Solr nodes with their zookeeper configuration
(defined in /etc/sysconfig/solrnode* and with collection.configName set
to 'ldwa01cfg') pointing at the development zookeeper ensemble, the
nodes come up as I'd expect: leaders are elected for each shard first,
then the remaining nodes become replicas. But if I change the ldwa01
Solr nodes to point at the zookeeper ensemble also used for the ukdomain
collection, all ldwa01 Solr nodes end up on the same shard (that is, the
first ldwa01 Solr node becomes that shard's leader, then every other
node becomes a replica of the same shard). The significant point here is
that no other ldwa01 shard gains a leader (or replicas).
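For concreteness, the configuration in question amounts to something like the sketch below. The variable names and hostnames are illustrative assumptions, not the literal contents of /etc/sysconfig/solrnode*:

```
# /etc/sysconfig/solrnode1 -- sketch only; names and hosts are assumed
# Pointing at the development ensemble: shard leaders/replicas come up as expected
ZK_HOST="zkdev01:2181,zkdev02:2181,zkdev03:2181"

# Pointing at the production ensemble shared with ukdomain: every ldwa01
# node joins the same single shard
#ZK_HOST="zkprod01:2181,zkprod02:2181,zkprod03:2181"

# Passed through to Solr at startup, roughly as:
#   -DzkHost=$ZK_HOST -Dcollection.configName=ldwa01cfg
COLLECTION_CONFIG_NAME="ldwa01cfg"
```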


The ukdomain collection uses a zookeeper collection.configName of
'ukdomaincfg', and the 'ldwa01cfg' name had never been used before this
ldwa01 service was created. So I'm confused as to why the ldwa01 service
behaves differently when the only difference is which zookeeper ensemble
is used.
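In case it helps, the state I'm comparing can be inspected with the zkCli.sh that ships with ZooKeeper 3.4.5. The hostnames below are placeholders; the znode paths are the standard SolrCloud layout:

```
# Development ensemble: ldwa01 looks healthy here
bin/zkCli.sh -server zkdev01:2181 ls /configs
bin/zkCli.sh -server zkdev01:2181 ls /collections
bin/zkCli.sh -server zkdev01:2181 get /clusterstate.json

# Production ensemble (shared with ukdomain): only one ldwa01 shard
# ever gains a leader
bin/zkCli.sh -server zkprod01:2181 get /collections/ldwa01
bin/zkCli.sh -server zkprod01:2181 get /clusterstate.json
```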


If anyone can explain why this is happening and how I can get the ldwa01
services to start correctly using the non-development zookeeper
ensemble, I'd be very grateful! If more information or explanation is
needed, just ask.


Thanks, Gil


Gil Hoggarth

Web Archiving Technical Services Engineer

The British Library, Boston Spa, West Yorkshire, LS23 7BQ

Hoggarth, Gil 2013-10-24, 15:33
Patrick Hunt 2013-10-25, 20:45