Re: [VOTE] The 1st hbase-0.96.0 release candidate is available for download
Yeah, I got that for a while as well.  Though once online schema change
was disabled, I could complete the reduce step with 2.5 GB of heap (still
failing :-/).

On Tue, Sep 3, 2013 at 6:30 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> Well from the test code it seems like the problem is due to the fact that
> the reducer got unexpected data and it was trying to construct the log
> message for the user. So the job had already failed in reality.
>
>
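A note on the trace quoted further down: the OOM is raised in
StringBuilder.append, called from VerifyReducer.reduce, which fits the
explanation above that the reducer was building a log message out of
unexpected data. One way to keep such a message from exhausting the heap
is to cap the accumulated text and only count whatever does not fit. The
following is a minimal sketch of that idea, not the test's actual code:
BoundedRefMessage, summarize, and maxChars are hypothetical names, and it
assumes the offending references are available as strings.

```java
import java.util.Arrays;
import java.util.List;

public class BoundedRefMessage {

    /**
     * Joins refs with commas but stops appending once maxChars is reached,
     * then reports how many entries were left out. The builder grows to
     * roughly maxChars plus one entry and a short suffix, however many
     * refs arrive.
     */
    static String summarize(Iterable<String> refs, int maxChars) {
        StringBuilder sb = new StringBuilder();
        long omitted = 0;
        for (String ref : refs) {
            if (sb.length() >= maxChars) {
                omitted++;          // keep counting, stop accumulating
                continue;
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(ref);
        }
        if (omitted > 0) {
            sb.append(" ... (").append(omitted).append(" more omitted)");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> refs = Arrays.asList("row-a", "row-b", "row-c", "row-d");
        // prints: row-a,row-b ... (2 more omitted)
        System.out.println(summarize(refs, 10));
    }
}
```

With a cap like this the reduce task would still report the failure, but
the size of the diagnostic message would no longer depend on how much
unexpected data the reducer received.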
> On Tue, Sep 3, 2013 at 6:17 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
>
>> Elliott, what are the heap sizes of the M/R tasks in your setup? I was
>> running the job like this (without chaosmonkey to start with):
>>
>> hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList Loop 5 12 2500000 IntegrationTestBigLinkedList 10
>>
>> Even the above test failed with one reduce task failing with OOM, in the
>> verify step. The heap size was set to 3G.
>>
>> 2013-09-04 01:11:56,054 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
>>       at java.util.Arrays.copyOf(Arrays.java:2882)
>>       at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>>       at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>>       at java.lang.StringBuilder.append(StringBuilder.java:119)
>>       at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$VerifyReducer.reduce(IntegrationTestBigLinkedList.java:576)
>>       at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$VerifyReducer.reduce(IntegrationTestBigLinkedList.java:547)
>>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>       at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:645)
>>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:405)
>>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
>>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>
>> On Tue, Sep 3, 2013 at 12:56 PM, Elliott Clark <[EMAIL PROTECTED]> wrote:
>>
>>> Can someone take a look at running IntegrationTestBigLinkedList for more
>>> than 5 iterations with the slowDeterministic chaos monkey on a distributed
>>> cluster?  I'm pretty concerned about HBASE-9338.
>>>
>>> On Tue, Sep 3, 2013 at 6:57 AM, Jean-Marc Spaggiari
>>> <[EMAIL PROTECTED]> wrote:
>>> > There was a typo in my log4j.properties :(
>>> >
>>> > So it's working fine.
>>> >
>>> > The only INFO logs I still see are these:
>>> > 2013-09-03 09:53:07,313 INFO  [M:0;t430s:45176] mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>> > 2013-09-03 09:53:07,350 INFO  [M:0;t430s:45176] mortbay.log: jetty-6.1.26
>>> > But there are only very few of them.
>>> >
>>> > Performance-wise, here are the numbers (the higher, the better; rows per
>>> > second, except for scans, where it's rows/min). As you will see, 0.96 is
>>> > slower only for RandomSeekScanTest (way slower) and RandomScanWithRange10,
>>> > but is faster for everything else. I ran the tests with the default
>>> > settings. So I think we should look at RandomSeekScanTest, but except for
>>> > this one, everything else is pretty good.
>>> >
>>> > Also, I have been able to reproduce this exception:
>>> > 2013-09-03 09:55:36,718 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: caught end of stream exception
>>> > EndOfStreamException: Unable to read additional data from client sessionid 0x140e4191edb0009, likely client has closed socket
>>> >     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
>>> >     at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)