HBase dev mailing list: HBase 0.92/Hadoop 0.22 test results


Roman Shaposhnik 2011-11-03, 22:34
Stack 2011-11-03, 22:48
Ted Yu 2011-11-03, 22:52
Shvachko, Konstantin 2011-11-04, 00:35
Roman Shaposhnik 2011-11-03, 23:01
Andrew Purtell 2011-11-03, 23:15
Ted Yu 2011-11-03, 23:37
Stack 2011-11-05, 04:02
Roman Shaposhnik 2011-11-05, 23:28
Ted Yu 2011-11-06, 00:36
Stack 2011-11-06, 22:33
Roman Shaposhnik 2011-11-07, 00:12
Roman Shaposhnik 2011-11-07, 02:38
Ted Yu 2011-11-07, 05:00
Ted Yu 2011-11-07, 05:54
Roman Shaposhnik 2011-11-08, 06:16
Roman Shaposhnik 2011-11-08, 06:37
Ted Yu 2011-11-08, 17:20
Roman Shaposhnik 2011-11-08, 17:26
Ted Yu 2011-11-08, 17:33
Stack 2011-11-08, 22:06
Roman Shaposhnik 2011-11-08, 22:29
Roman Shaposhnik 2011-11-09, 00:10
Ted Yu 2011-11-09, 00:20
Roman Shaposhnik 2011-11-09, 00:27
Roman Shaposhnik 2011-11-09, 19:44
Todd Lipcon 2011-11-09, 21:38
Andrew Purtell 2011-11-10, 15:25
Re: HBase 0.92/Hadoop 0.22 test results
Guys,

In the log file attached by Roman I see the exception below, which I
think is the reason for the failures.
It says "could only be replicated to 0 nodes, instead of 1", meaning
that HDFS could not find any target DataNodes for the block.
This can happen either if the disks are full, which I think Roman has
already ruled out, or if there is a spike in new block creations or in
write activity generally.
If there is an instantaneous increase in block creation, a node's load
can go over 2 * avgLoad, because DataNodes do not keep up with
reporting their load, and the NameNode then passes over such nodes as
replication targets. It would be interesting to turn on DEBUG-level
logging; then we should see why locations are not being chosen.
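
Roughly, the check works like this (a simplified, illustrative sketch
of the target chooser's load test, paraphrased from memory and not the
verbatim Hadoop 0.22 code; the class and method names are made up for
illustration):

    // Illustrative sketch (not the verbatim Hadoop source) of the
    // considerLoad check in the default replication target chooser.
    public class ConsiderLoadSketch {
        static boolean passesLoadCheck(int nodeXceiverCount,
                                       int clusterTotalLoad,
                                       int numDatanodes,
                                       boolean considerLoad) {
            if (!considerLoad) {
                return true; // considerLoad=false skips the check entirely
            }
            double avgLoad = (numDatanodes == 0)
                    ? 0.0
                    : (double) clusterTotalLoad / numDatanodes;
            // A node reporting more than twice the cluster-average number
            // of active transfer threads (xceivers) is rejected as a target.
            return nodeXceiverCount <= 2.0 * avgLoad;
        }

        public static void main(String[] args) {
            // 5 DNs with total load 50 => avgLoad = 10; a node reporting
            // 25 concurrent transfers fails the check (25 > 2 * 10).
            System.out.println(passesLoadCheck(25, 50, 5, true));  // false
            System.out.println(passesLoadCheck(25, 50, 5, false)); // true
        }
    }

If a write spike pushes every node over that threshold before the load
reports catch up, no target qualifies, which would match the "0 nodes,
instead of 1" message.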
It may also make sense to run the same test with
dfs.namenode.replication.considerLoad = false; then the average load
will not be taken into account when choosing targets. By default it is
true.
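
Concretely, that would be something like this in the NameNode's
hdfs-site.xml (a sketch; as far as I remember the property is read at
startup, so a NameNode restart is needed for it to take effect):

    <!-- Disable the load-based replication target check. -->
    <property>
      <name>dfs.namenode.replication.considerLoad</name>
      <value>false</value>
    </property>

For the DEBUG logging, something like
log4j.logger.org.apache.hadoop.hdfs.server.namenode=DEBUG in the
NameNode's log4j.properties should surface why each candidate target
gets rejected.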

11/11/08 18:50:04 WARN regionserver.SplitLogWorker: log splitting of
hdfs://ip-10-46-114-25.ec2.internal:17020/hbase/.logs/ip-10-245-191-239.ec2.internal,60020,1320792860210-splitting/ip-10-245-191-239.ec2.internal%2C60020%2C1320792860210.1320796004063
failed, returning error
java.io.IOException: File
/hbase/splitlog/domU-12-31-39-09-E8-31.compute-1.internal,60020,1320792889412_hdfs%3A%2F%2Fip-10-46-114-25.ec2.internal%3A17020%2Fhbase%2F.logs%2Fip-10-245-191-239.ec2.internal%2C60020%2C1320792860210-splitting%2Fip-10-245-191-239.ec2.internal%252C60020%252C1320792860210.1320796004063/TestLoadAndVerify_1320795370905/26bcb17daa237ce131b83e43ee48224c/recovered.edits/0000000000000048153
could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1646)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:829)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:349)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1482)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1478)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1476)

    at org.apache.hadoop.ipc.Client.call(Client.java:1028)
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
    at $Proxy7.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:84)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy7.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:975)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:847)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:447)

On Tue, Nov 8, 2011 at 4:10 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> +Konstantin (there's something weird in append handling)
>
> Some more updates; hope this helps. I had a hunch that I was seeing
> those weird issues when the HDFS DataNodes were at 80% capacity (but
> nowhere near full!). So I quickly spun up a cluster that had 5 DNs
> with modest (and unbalanced!) amounts of storage. Here's what started
> happening towards the end of loading 2M records into HBase:
>
> On the master:
>
> {"statustimems":-1,"status":"Waiting for distributed tasks to finish.
> scheduled=4 done=0
> error=3","starttimems":1320796207862,"description":"Doing distributed
> log split in [hdfs://ip-10-46-114-25.ec2.internal:17020/hbase/.logs/ip-10-245-191-239.ec2.internal,60020,1320792860210-splitting]","state":"RUNNING","statetimems":-1},{"statustimems":1320796275317,"status":"Waiting
Andrew Purtell 2011-11-10, 15:22
Roman Shaposhnik 2011-11-11, 00:33
Andrew Purtell 2011-11-11, 03:53
Stack 2011-11-11, 16:50
Roman Shaposhnik 2011-11-11, 17:12
Andrew Purtell 2011-11-11, 22:56