HBase >> mail # dev >> [VOTE] The 1st hbase-0.96.0 release candidate is available for download


Stack 2013-08-31, 03:35
Jean-Marc Spaggiari 2013-08-31, 11:15
Devaraj Das 2013-08-31, 18:23
Stack 2013-08-31, 22:17
Enis Söztutar 2013-08-31, 22:16
Stack 2013-08-31, 22:13
Jean-Marc Spaggiari 2013-09-02, 16:41
Stack 2013-09-02, 17:00
Jean-Marc Spaggiari 2013-09-02, 17:20
Jean-Marc Spaggiari 2013-09-02, 17:51
Stack 2013-09-03, 13:06
Jean-Marc Spaggiari 2013-09-03, 13:57
Elliott Clark 2013-09-03, 19:56
Devaraj Das 2013-09-03, 20:19
Devaraj Das 2013-09-04, 01:17
Devaraj Das 2013-09-04, 01:30
Re: [VOTE] The 1st hbase-0.96.0 release candidate is available for download
Yeah, I got that for a while as well.  Though once online schema change
was disabled I could complete the reduce step with 2.5 G of heap (still
failing :-/).

On Tue, Sep 3, 2013 at 6:30 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> Well from the test code it seems like the problem is due to the fact that
> the reducer got unexpected data and it was trying to construct the log
> message for the user. So the job had already failed in reality.
>
>
> On Tue, Sep 3, 2013 at 6:17 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
>
>> Elliott, what are the heap sizes of the M/R tasks in your setup? I was
>> running the job like this (without chaosmonkey to start with):
>>
>> hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList Loop 5 12
>> 2500000 IntegrationTestBigLinkedList 10
>>
>> Even the above test failed with one reduce task failing with OOM, in the
>> verify step. The heap size was set to 3G.
>>
>> 2013-09-04 01:11:56,054 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
>>       at java.util.Arrays.copyOf(Arrays.java:2882)
>>       at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>>       at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>>       at java.lang.StringBuilder.append(StringBuilder.java:119)
>>       at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$VerifyReducer.reduce(IntegrationTestBigLinkedList.java:576)
>>       at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$VerifyReducer.reduce(IntegrationTestBigLinkedList.java:547)
>>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>       at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:645)
>>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:405)
>>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
>>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>
>>
>>
>>
>>
>> On Tue, Sep 3, 2013 at 12:56 PM, Elliott Clark <[EMAIL PROTECTED]> wrote:
>>
>>> Can someone take a look at running Test Big Linked for > 5 iterations
>>> with slowDeterministic chaos monkey on a distributed cluster? I'm
>>> pretty concerned about HBASE-9338.
>>>
>>> On Tue, Sep 3, 2013 at 6:57 AM, Jean-Marc Spaggiari
>>> <[EMAIL PROTECTED]> wrote:
>>> > There was a typo in my log4j.properties :(
>>> >
>>> > So it's working fine.
>>> >
>>> > The only INFO logs I still see are these:
>>> > 2013-09-03 09:53:07,313 INFO  [M:0;t430s:45176] mortbay.log: Logging to
>>> > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> > org.mortbay.log.Slf4jLog
>>> > 2013-09-03 09:53:07,350 INFO  [M:0;t430s:45176] mortbay.log: jetty-6.1.26
>>> > But there are only very few of them.
>>> >
>>> > Performance-wise, here are the numbers (the higher, the better; rows
>>> > per second, except for scans where it's rows/min). As you will see,
>>> > 0.96 is slower only for RandomSeekScanTest (way slower) and
>>> > RandomScanWithRange10, but is faster for everything else. I ran the
>>> > tests with the default settings. So I think we should look at
>>> > RandomSeekScanTest, but except for this one, everything else is
>>> > pretty good.
>>> >
>>> > Also, I have been able to reproduce this exception:
>>> > 2013-09-03 09:55:36,718 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
>>> > server.NIOServerCnxn: caught end of stream exception
>>> > EndOfStreamException: Unable to read additional data from client
>>> > sessionid 0x140e4191edb0009, likely client has closed socket
>>> >     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
>>> >     at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
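
[Editor's note] The stack trace above shows the reducer dying inside StringBuilder.append while assembling its error report (IntegrationTestBigLinkedList.java:576), which fits Devaraj's reading: the verify step had already found bad data, and building the log message describing all of it is what blew the heap. A minimal sketch of the defensive pattern being discussed; the class, method, and cap below are hypothetical illustrations, not the actual test code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not IntegrationTestBigLinkedList itself): cap a
// diagnostic message so that reporting a huge amount of unexpected data
// cannot itself exhaust the reducer's heap.
public class BoundedDiagnostic {

    static final int MAX_LOGGED_REFS = 100;  // hypothetical cap

    static String describeUnexpectedRefs(List<String> refs) {
        StringBuilder sb = new StringBuilder("Unexpected references: ");
        int logged = 0;
        for (String ref : refs) {
            if (logged >= MAX_LOGGED_REFS) {
                // Summarize the remainder instead of appending every value.
                sb.append("... and ").append(refs.size() - logged).append(" more");
                break;
            }
            sb.append(ref).append(' ');
            logged++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> refs = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            refs.add("ref" + i);  // stand-ins for unexpected row keys
        }
        String msg = describeUnexpectedRefs(refs);
        // The message stays small no matter how many bad references there are.
        System.out.println(msg.length() + " chars, ending: "
                + msg.substring(msg.length() - 17));
    }
}
```

Whatever cap is chosen, the point is that the size of the failure report stays bounded independently of how much unexpected data the reducer saw, so an already-failing job reports the failure instead of OOMing while describing it.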
Stack 2013-09-04, 00:09
Jean-Marc Spaggiari 2013-09-04, 02:35
Stack 2013-09-05, 00:34
Elliott Clark 2013-09-06, 20:42
Stack 2013-09-07, 08:19
Nicolas Liochon 2013-09-09, 17:34
Jonathan Hsieh 2013-09-09, 20:45
Stack 2013-09-09, 21:24
Enis Söztutar 2013-09-09, 21:11
Nick Dimiduk 2013-09-09, 17:33
Stack 2013-09-11, 17:19
Devaraj Das 2013-09-11, 17:51
Sergey Shelukhin 2013-09-11, 18:11
Enis Söztutar 2013-09-11, 18:18
Nick Dimiduk 2013-09-11, 18:22
Stack 2013-09-11, 20:26
Stack 2013-09-16, 15:38
Nicolas Liochon 2013-09-16, 16:26
Stack 2013-09-16, 19:00
Enis Söztutar 2013-09-16, 18:40
Devaraj Das 2013-09-16, 20:34
Stack 2013-09-17, 00:10
Devaraj Das 2013-09-17, 00:13
Stack 2013-09-17, 00:22
Stack 2013-09-17, 06:17
Nick Dimiduk 2013-09-17, 18:08
Stack 2013-09-17, 18:10
Nick Dimiduk 2013-09-17, 20:28