Re: Replication hosed after simple cluster restart
Not sure I follow.  Is this us using multi against a ZK ensemble
that doesn't support it?
On Mar 13, 2013 6:22 PM, "lars hofhansl" <[EMAIL PROTECTED]> wrote:

> I suppose the problem could be in
> zkHelper.copyQueuesFromRSUsingMulti(rsZnode) as called from
> ReplicationSourceManager.NodeFailoverWorker.run().
> copyQueuesFromRSUsingMulti will return the queues it read even when the
> multi operation failed (because another RS managed to execute it first).
>
> -- Lars
>
>
>
> ________________________________
>  From: lars hofhansl <[EMAIL PROTECTED]>
> To: hbase-dev <[EMAIL PROTECTED]>
> Sent: Wednesday, March 13, 2013 6:12 PM
> Subject: Replication hosed after simple cluster restart
>
> We just ran into an interesting scenario. We restarted a cluster that was
> set up as a replication source.
> The stop went cleanly.
>
> Upon restart *all* regionservers aborted within a few seconds with
> variations of these errors:
> http://pastebin.com/3iQVuBqS
>
> This is scary!
>
> -- Lars
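
The failure mode suspected in the quoted follow-up (a caller keeping the queues it read even though its multi lost the race to another region server) can be illustrated with a small sketch. This is not HBase's code: `FailoverSketch`, its in-memory map standing in for the ZooKeeper znodes, and the methods `copyQueuesUnchecked` and `copyQueuesChecked` are all hypothetical names made up for illustration, and the `betweenReadAndMulti` hook exists only to force the interleaving deterministically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the replication queue znodes (hypothetical, not HBase code). */
public class FailoverSketch {
    // server name -> replication queues currently owned by that server
    private final Map<String, List<String>> queues = new HashMap<>();

    public synchronized void put(String server, List<String> q) {
        queues.put(server, new ArrayList<>(q));
    }

    /** Returns a copy of the queues owned by the given server (empty if none). */
    public synchronized List<String> read(String server) {
        List<String> q = queues.get(server);
        return q == null ? new ArrayList<>() : new ArrayList<>(q);
    }

    /** Stands in for the atomic multi: only the first claimant succeeds. */
    public synchronized boolean claim(String dead, String claimant) {
        List<String> q = queues.remove(dead);
        if (q == null) {
            return false; // another server already moved the queues
        }
        queues.computeIfAbsent(claimant, k -> new ArrayList<>()).addAll(q);
        return true;
    }

    /** Suspected pattern: the result of the atomic move is ignored. */
    public static List<String> copyQueuesUnchecked(
            FailoverSketch zk, String dead, String me, Runnable betweenReadAndMulti) {
        List<String> read = zk.read(dead);
        betweenReadAndMulti.run(); // a competing server may claim here
        zk.claim(dead, me);        // failure is not checked
        return read;               // returned even if the claim failed
    }

    /** Defensive variant: return the queues only if the atomic move succeeded. */
    public static List<String> copyQueuesChecked(
            FailoverSketch zk, String dead, String me, Runnable betweenReadAndMulti) {
        List<String> read = zk.read(dead);
        betweenReadAndMulti.run();
        return zk.claim(dead, me) ? read : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        FailoverSketch zk = new FailoverSketch();
        zk.put("rs-dead", List.of("q1", "q2"));
        // rs-b wins the race while rs-a is between its read and its multi.
        List<String> got = copyQueuesUnchecked(
                zk, "rs-dead", "rs-a", () -> zk.claim("rs-dead", "rs-b"));
        System.out.println("rs-a believes it owns: " + got);
        System.out.println("rs-a actually owns:    " + zk.read("rs-a"));
    }
}
```

In the unchecked variant the losing server walks away with a list of queues that the store has already handed to someone else, so two servers would try to process the same queues. Whether that is what actually caused the aborts in this thread is not established here; the sketch only shows why ignoring the outcome of the atomic move is hazardous.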