Accumulo >> mail # user >> Zookeeper Implementation


Drew Thornton 2013-07-15, 18:04
Eric Newton 2013-07-15, 18:31
Drew Thornton 2013-07-15, 19:43
Denis 2013-07-15, 19:55
Drew Thornton 2013-07-16, 13:23

Re: Zookeeper Implementation
Confirmed. See ACCUMULO-1572
<https://issues.apache.org/jira/browse/ACCUMULO-1572>.

-Eric
On Tue, Jul 16, 2013 at 9:23 AM, Drew Thornton
<[EMAIL PROTECTED]> wrote:

> Thank you, but that is not the situation.
>
> If one zookeeper node is shut down or fails and the rest of the
> ensemble stays up, the tablet servers attached as clients to that
> node immediately fail. If one of the clients happens to be the master, the
> cluster goes down.
>
> Accumulo does not seem to be failing over to the remaining zookeeper
> nodes, so I have to restart the individual tablet servers.
>
> The zookeeper ensemble is very stable and has plenty of
> bandwidth/memory/processing, so taking one node down out of five doesn't
> crash the zookeepers, just the tablet servers...
>
>
>
> Drew Thornton
> Data Tactics Corporation
> [EMAIL PROTECTED]
> 571.297.2173 (w)
> 804.615.0771 (m)
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> Denis
> Sent: Monday, July 15, 2013 3:56 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Zookeeper Implementation
>
> Hi
>
> I have seen this behavior (with Accumulo 1.4.4, though) when one
> Zookeeper node was restarted and then, after a delay of a few seconds,
> another node was restarted.
>
> I did not investigate the issue, but it seems that if you want to change
> the Zookeeper configuration and restart all nodes, you have to wait a few
> minutes between restarts.
>
> On 7/15/13, Drew Thornton <[EMAIL PROTECTED]> wrote:
> > Yes, [ maxClientCnxns=100 ]. I've used full hostnames and ports as
> > well in accumulo-site.xml.
> >
> > I noticed the pattern of crashes when I was testing Zookeeper's JVM
> > garbage collector settings. I would take one node out at a time to
> > restart its JVM, and individual Tablet Servers (and eventually the
> > master) would crash depending on the Zookeeper node that I took down.
> >
> > Drew
> >
> > From: Eric Newton [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, July 15, 2013 2:31 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Zookeeper Implementation
> >
> > You are giving the names of all the zookeeper nodes in
> > accumulo-site.xml, right?
> >
> >   <property>
> >     <name>instance.zookeeper.host</name>
> >     <value>zoo1,zoo2,zoo3,zoo4,zoo5</value>
> >   </property>
> >
> > Have you increased maxClientCnxns as described in the accumulo README?
> >
> > -Eric
> >
> >
> > On Mon, Jul 15, 2013 at 2:04 PM, Drew Thornton
> > <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
> > Hello,
> >
> > I'm running a small cluster of 10 tablet servers and 5 zookeeper nodes
> > (CDH 4.3, Zookeeper 3.4.5, Accumulo 1.5.0).
> >
> > I have noticed that when a zookeeper node dies, the connected tablet
> > server clients also die instead of failing-over to another zookeeper.
> > If the clients on the failed zookeeper are only tablet servers,
> > Accumulo reassigns the tablets. If the Accumulo Master is one of the
> > clients on the failed node, then the master goes down and the cluster
> > with it.
> >
> > Anyone else have this problem or know of a workaround/solution to keep
> > the cluster up when zookeeper changes state?
> >
> > Thanks,
> > Drew
> >
> >
> >
> >
>
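
Editor's sketch, not part of the original thread: Denis's advice to wait
between ZooKeeper restarts can be approximated with a small health check
run before taking the next node down. The Python below is a minimal
sketch under stated assumptions. It parses a host list in the same
comma-separated form as instance.zookeeper.host, sends ZooKeeper's "ruok"
four-letter command to a server, and polls until the server answers
"imok". The function names (parse_hosts, is_serving, wait_until_serving),
the default port 2181, and the retry counts are illustrative choices, not
anything taken from Accumulo or from the posters' setup. An "imok" reply
only shows that the server process is up and listening on its client
port. It does not show that the node has rejoined the quorum, and newer
ZooKeeper releases may require the command to be explicitly enabled.

```python
import socket
import time

DEFAULT_PORT = 2181  # conventional ZooKeeper client port (assumed default)


def parse_hosts(connect_string):
    """Split 'zoo1,zoo2:2182' into [('zoo1', 2181), ('zoo2', 2182)]."""
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.partition(":")
        servers.append((host, int(port) if sep else DEFAULT_PORT))
    return servers


def is_serving(host, port, timeout=2.0):
    """Send the 'ruok' four-letter command; True only if the reply is 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The server answers and closes the connection; read up to 4 bytes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        # Connection refused, timeout, DNS failure: treat as "not serving".
        return False


def wait_until_serving(host, port, probe=is_serving, attempts=30,
                       delay=2.0, sleep=time.sleep):
    """Poll one server until it answers, giving up after `attempts` tries."""
    for _ in range(attempts):
        if probe(host, port):
            return True
        sleep(delay)
    return False
```

A rolling restart would then restart one node, call
wait_until_serving(host, port) for it, and only move to the next entry
from parse_hosts("zoo1,zoo2,zoo3,zoo4,zoo5") once that call returns True,
ideally after an additional pause so the node can rejoin the ensemble.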