HBase >> mail # user >> OutOfMemoryError in MapReduce Job


John 2013-11-01, 13:48
Jean-Marc Spaggiari 2013-11-01, 18:36
John 2013-11-02, 12:43

Re: OutOfMemoryError in MapReduce Job
I would try to compress this bit set.
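
For illustration, a minimal sketch of that compression idea, assuming the bit vector is a java.util.BitSet as in the mapper quoted below; the class name is hypothetical, and BitSet.toByteArray() needs Java 7 (on Java 6 the thread's own toByteArray(bitvector) helper would take its place):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.BitSet;
import java.util.zip.GZIPOutputStream;

public class BitSetCodec {

    // Serialize the BitSet and GZIP the bytes before they go into the Put;
    // a sparse bit set compresses very well.
    public static byte[] compress(BitSet bitvector) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(bos);
        gzip.write(bitvector.toByteArray());
        gzip.close();  // close() finishes the stream so the GZIP trailer is written
        return bos.toByteArray();
    }
}

The read path would mirror this with a GZIPInputStream feeding BitSet.valueOf().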

On Nov 2, 2013, at 2:43 PM, John <[EMAIL PROTECTED]> wrote:

> Hi,
>
> thanks for your answer! I increased the "Map Task Maximum Heap Size" to 2 GB
> and it seems to work. The OutOfMemoryError is gone. But the HBase region
> servers are now crashing all the time :-/ I'm trying to store the bit vector
> (120 MB in size) for some rows. This seems to be very memory intensive; the
> usedHeapMB metric increases very fast (up to 2 GB). I'm not sure whether it
> is the reading or the writing side that causes this, but I think it's the
> writing. Any idea how to minimize the memory usage? My mapper looks like this:
>
> public class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
>
>     // name, cf and bitvector are assumed to be set elsewhere in the mapper
>     private void storeBitvectorToHBase(Context context)
>             throws IOException, InterruptedException {
>         Put row = new Put(name);
>         row.setWriteToWAL(false);  // skip the WAL to reduce write overhead
>         row.add(cf, Bytes.toBytes("columname"), toByteArray(bitvector));
>         ImmutableBytesWritable key = new ImmutableBytesWritable(name);
>         context.write(key, row);
>     }
> }
>
>
> kind regards
>
>
> 2013/11/1 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
>
>> Hi John,
>>
>> You might be better off asking this on the CDH mailing list, since it's more
>> related to Cloudera Manager than to HBase.
>>
>> In the meantime, can you try to update the "Map Task Maximum Heap Size"
>> parameter too?
>>
>> JM
>>
>>
>> 2013/11/1 John <[EMAIL PROTECTED]>
>>
>>> Hi,
>>>
>>> I have a problem with the memory. My use case is the following: I've
>>> created a MapReduce job and iterate over every row in it. If a row has
>>> more than, for example, 10k columns, I create a Bloom filter (a BitSet)
>>> for this row and store it in the HBase structure. This worked fine so far.
>>>
>>> BUT now I'm trying to store a BitSet with 1000000000 elements = ~120 MB in
>>> size. In every map() call there exist 2 BitSets. If I try to execute the
>>> MR job I get this error: http://pastebin.com/DxFYNuBG
>>>
>>> Obviously, the tasktracker does not have enough memory. I tried to adjust
>>> the memory configuration, but I'm not sure which setting is the right one.
>>> I tried to change the "MapReduce Child Java Maximum Heap Size" value from
>>> 1 GB to 2 GB, but still got the same error.
>>>
>>> Which parameters do I have to adjust? BTW, I'm using CDH 4.4.0 with
>>> Cloudera Manager.
>>>
>>> kind regards
>>>
>>
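
For reference, outside of Cloudera Manager the map-task heap is an ordinary MRv1 job property (the thread mentions the tasktracker, so MRv1 is assumed). A minimal sketch of setting it per job; whether Cloudera Manager's "Map Task Maximum Heap Size" field writes exactly this property is an assumption here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MapHeapConfig {

    // Build a job Configuration whose map-task child JVMs get a 2 GB heap.
    public static Configuration withLargerMapHeap() {
        Configuration conf = HBaseConfiguration.create();
        // Map tasks only; mapred.child.java.opts would cover map and reduce tasks alike.
        conf.set("mapred.map.child.java.opts", "-Xmx2048m");
        return conf;
    }
}

Keep in mind that each map() call already holds two ~120 MB BitSets plus the serialized copy handed to the Put, so the per-task working set is several hundred megabytes before any framework buffers.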
John 2013-11-02, 15:29
Asaf Mesika 2013-11-02, 16:27
John 2013-11-02, 16:46
Asaf Mesika 2013-11-03, 19:53
John 2013-11-03, 23:12
Ted Yu 2013-11-02, 17:53
John 2013-11-02, 18:01