Zookeeper, mail # user - HBase dies after some time


Re: HBase dies after some time
Harsh J 2012-05-30, 07:33
You may colocate your ZK with the HBase Master, as it's not very heavy.
Depending on your cluster size, 1-3 ZK servers may be enough, and you can
divide them among the HBM, SNN and perhaps NN/JT machines.
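
The heap check Chris describes further down in this thread is simple
arithmetic, so here is a rough sketch of it. Everything in it is an
assumption for illustration: the function name heap_budget, the 1 GB OS
reserve, and all heap and RAM figures are hypothetical, so substitute the
-Xmx values and physical memory of your own machines. A JVM also uses
memory beyond its heap, so treat the headroom as optimistic.

```python
def heap_budget(heaps_mb, physical_mb, os_reserve_mb=1024):
    """Sum the configured heaps and compare them with physical memory.

    Returns (total_heap_mb, headroom_mb). A negative headroom means the
    daemons can together claim more memory than the machine has.
    """
    total = sum(heaps_mb.values())
    headroom = physical_mb - os_reserve_mb - total
    return total, headroom


if __name__ == "__main__":
    # Hypothetical heap sizes (MB) for one worker node; not taken from the thread.
    heaps = {
        "ZooKeeper": 1024,
        "HRegionServer": 4096,
        "TaskTracker": 1024,
        "MR child tasks (40 x 200 MB)": 40 * 200,
    }
    total, headroom = heap_budget(heaps, physical_mb=7680)
    print("total heap: %d MB, headroom: %d MB" % (total, headroom))
    if headroom < 0:
        print("Oversubscribed: lower the heaps or the number of task slots.")
```

If the headroom comes out negative, the machine can start swapping under
a large MR job, which fits the memory explanation discussed below.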

On Wed, May 30, 2012 at 2:54 AM, Something Something
<[EMAIL PROTECTED]> wrote:
> Hmm.. due to budget constraints, I am forced to install ZooKeeper on the
> same machine that runs TaskTracker.  When a big MR job starts it fires up
> over 40 tasks, so as you implied this could definitely be related to memory.
>
> Should ZooKeepers be started on their own machines?  Right now I have
> ZooKeeper, HRegionServer & TaskTracker running on the same machine.  This
> is a bad idea, right?  Is there any way to get ZooKeeper working under
> these restrictions?
>
> By the way, the ZooKeeper log shows this:
>
> 2012-05-29 13:56:54,842 - ERROR [CommitProcessor:2:NIOServerCnxn@445] - Unexpected Exception:
> java.nio.channels.CancelledKeyException
>        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
>        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
>        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
>
>
>
>
> On Sat, May 26, 2012 at 2:28 AM, Christian Schäfer
> <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi,
>>
>>  I got exactly the same behaviour and exceptions that you mention on a
>> local cluster.
>>
>> In my case the sum of all services' heapspace was higher than the actual
>> memory of the machine.
>> First, sum the heap sizes of the services on your master machine, which
>> is likely running NameNode, HMaster, ZooKeeper, and maybe also
>> RegionServer and DataNode.
>> Then check that this sum is less than your master machine's memory.
>>
>> Good Luck.
>> Chris
>>
>> From: Something Something <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
>> Sent: Saturday, May 26, 2012, 3:22
>> Subject: HBase dies after some time
>>
>> Hello,
>>
>> I recently installed ZooKeeper & HBase on our dedicated Hadoop cluster on
>> EC2.  The HBase stays active for some time, but after a while it dies with
>> error messages similar to these:
>>
>> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x5378489312c0004-0x5378489312c0004 Received unexpected KeeperException, re-throwing exception
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
>>        at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
>>        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
>> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.master.ActiveMasterManager: master:60000-0x5378489312c0004-0x5378489312c0004 Error deleting our own master address node
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)

Harsh J