Ben Cuthbert 2012-05-20, 07:12
All
We ran a load test and after about 3 hours our application stopped. Checking the logs, I see this in the hbase-master log:
2012-05-20 08:08:17,251 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server
2012-05-20 08:08:17,252 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=.META.,,1.1028785192 state=OFFLINE, ts=1337497517243
2012-05-20 08:08:17,252 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1337497517243
2012-05-20 08:10:10,309 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,315 INFO org.apache.zookeeper.server.NIOServerCnxn: Client attempting to establish new session at /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.NIOServerCnxn: Established session 0x137653a0e8e02fa with negotiated timeout 40000 for client /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x2 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/unassigned Error:KeeperErrorCode = NodeExists for /hbase/unassigned
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x3 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/rs Error:KeeperErrorCode = NodeExists for /hbase/rs
2012-05-20 08:10:10,330 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x4 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/table Error:KeeperErrorCode = NodeExists for /hbase/table

Hadoop seems to be up and running.
The last log entries in the datanode are:

12/05/20 06:15:25 INFO datanode.DataBlockScanner: Verification succeeded for blk_-3639294708473848144_3329
12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded for blk_2502932128500788221_3413
12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded for blk_3390059684225099859_3440
12/05/20 06:59:32 INFO datanode.DataNode: BlockReport of 157 blocks took 19 msec to generate and 3 msecs for RPC and NN processing
12/05/20 07:24:51 INFO datanode.DataBlockScanner: Verification succeeded for blk_8954400942867609419_3363
12/05/20 07:55:51 INFO datanode.DataBlockScanner: Verification succeeded for blk_-3650918785526360502_3387
12/05/20 07:59:33 INFO datanode.DataNode: BlockReport of 157 blocks took 20 msec to generate and 3 msecs for RPC and NN processing
12/05/20 08:07:25 INFO datanode.DataBlockScanner: Verification succeeded for blk_786514597978592338_3336

I tried using hbase-explorer to view the tables, but they all seem to be down.
Michael Segel 2012-05-20, 15:02
What did you see when you ran the HBase shell's status command?
Did you run status with more detail? (See status help.)
Marcos Ortiz 2012-05-20, 15:56
Well, reading your hbase-master.log, it seems that ZooKeeper is trying to access several paths and they are not in the cluster:
/hbase/rs
/hbase/unassigned

Is ZooKeeper running? Is HBase running?
Which versions of ZooKeeper and HBase are you using?

Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
http://marcosluis2186.posterous.com
http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186
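For anyone reading the ZooKeeper lines in the hbase-master log at the top of this thread: each PrepRequestProcessor entry records the request type, the znode path, and a KeeperErrorCode. Here is a minimal Python sketch that pulls those three fields out of such lines. The function name summarize_keeper_errors and the regular expression are hypothetical helpers written for this illustration, not part of HBase or ZooKeeper. On the quoted lines every request is a create and every code is NodeExists, which is ZooKeeper's code for a znode that was already present when a create was attempted.

```python
import re

# Sample lines copied from the hbase-master log quoted in this thread.
SAMPLE_LOG = """\
2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x2 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/unassigned Error:KeeperErrorCode = NodeExists for /hbase/unassigned
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x3 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/rs Error:KeeperErrorCode = NodeExists for /hbase/rs
2012-05-20 08:10:10,330 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x4 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/table Error:KeeperErrorCode = NodeExists for /hbase/table
"""

# Hypothetical pattern: request type, znode path, and error code on one line.
PATTERN = re.compile(
    r"type:(?P<op>\w+) .*?Error Path:(?P<path>\S+) "
    r"Error:KeeperErrorCode = (?P<code>\w+)"
)


def summarize_keeper_errors(log_text):
    """Return (operation, path, error code) for each KeeperException line."""
    return [
        (m.group("op"), m.group("path"), m.group("code"))
        for m in PATTERN.finditer(log_text)
    ]


# Print one row per KeeperException entry found in the sample.
for op, path, code in summarize_keeper_errors(SAMPLE_LOG):
    print(f"{op:<8}{path:<20}{code}")
```

Running the same function over a full log file makes it easier to spot any entry whose code is something other than NodeExists.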
Ben Cuthbert 2012-05-20, 15:56
I will try again, as I did not run that. I just saw this error when trying to use hbase-explorer to connect.
Ben Cuthbert 2012-05-20, 16:12
So HBase and Hadoop are running fine, but we wanted to test our application performance. So we ran some test cases for about 7 hours, sending in events every 200 ms to generate some load.
After the 7 hours the application server could not connect to ZooKeeper, and when I checked the logs this is what I saw. So the application functions, just not when we ran the test.

Config is
hadoop: 0.20.203.0
hbase: 0.90.3

So I am just trying to upgrade to

hadoop: 1.0.3
hbase: 0.92.1

Then going to run the same test again.
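When an application server cannot connect to ZooKeeper, one quick check is ZooKeeper's four-letter command ruok, to which a healthy server replies imok. The sketch below is only an illustration: the function name zk_ruok is hypothetical, and the host, the default client port 2181, and the timeout are assumptions to adjust for your setup. On newer ZooKeeper releases the four-letter commands may need to be whitelisted in the server configuration before this works.

```python
import socket


def zk_ruok(host="localhost", port=2181, timeout=5.0):
    """Send ZooKeeper's 'ruok' command; return True only if the reply is 'imok'.

    host, port and timeout are assumptions; 2181 is ZooKeeper's default
    client port, but a cluster may be configured differently.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            # The server closes the connection after answering, so read until EOF.
            while True:
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
    return b"".join(chunks) == b"imok"
```

Calling zk_ruok() from the application server during the load test (for example once a minute) would show whether ZooKeeper stops answering before the application does, or whether only the application's own connection fails.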
Marcos Ortiz 2012-05-20, 16:55
Well, you can see this link from GBIF about HBase performance evaluation:
* http://gbif.blogspot.com/2012/02/performance-evaluation-of-hbase.html

To upgrade to 0.92.1, just install the new version, shut down the old cluster, and start the new version of the cluster.
* Remember to upgrade ZooKeeper too, specifically to version 3.4.3.
* Remember that when upgrading to 0.92.1 you can't use a rolling restart, because the wire protocol changed in this version, so keep in mind that you will have downtime, depending on your cluster/data size.

Regards
Ben Cuthbert 2012-05-20, 16:57
Thanks, I will try an upgrade of ZooKeeper and the UAT cluster and see if it changes anything.