Hive >> mail # user >> The dreaded Heap Space Issue on a Transform

Re: The dreaded Heap Space Issue on a Transform
I am realizing one of my challenges is that I have quite a few cores and
map tasks per node, but (I didn't set it up) I am only running 4 GB per
physical core (12 cores) with 18 map slots.  I am guessing that at any
given time, with 18 map slots, the 1.8 GB total of RAM I am assigning to
the sort buffers is undersized, yet I am constrained on memory, so I can't
just raise it. Working on getting things upgraded. Thanks for all the
thoughts; I appreciate them.
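As a sanity check on those numbers (assuming 12 cores at 4 GB each, i.e. 48 GB total RAM, and the Hadoop 1.x default io.sort.mb of 100 MB per task; both are inferred from the figures above, not confirmed in the thread):

```python
# Back-of-envelope check of the memory figures quoted above.
# Assumed inputs: 4 GB per physical core, 12 cores, 18 map slots,
# io.sort.mb at the old Hadoop 1.x default of 100 MB per task.
total_ram_gb = 4 * 12        # 48 GB per node (assumed)
map_slots = 18
io_sort_mb = 100             # per-task sort buffer (assumed default)

# Every concurrently running task allocates its own sort buffer,
# so the aggregate is slots * io.sort.mb.
aggregate_sort_gb = map_slots * io_sort_mb / 1024
print(f"aggregate sort buffer: {aggregate_sort_gb:.1f} GB")  # -> 1.8 GB

# What each slot can claim if RAM is divided evenly.
ram_per_slot_gb = total_ram_gb / map_slots
print(f"RAM per map slot: {ram_per_slot_gb:.1f} GB")  # -> 2.7 GB
```

This matches the 1.8 GB figure in the message, and shows why raising io.sort.mb is risky here: it gets multiplied by all 18 slots at once.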

On Wed, Jan 30, 2013 at 10:40 AM, Dean Wampler <

> We didn't ask yet, but just to be sure: are all the slave nodes configured
> the same, both in terms of hardware and any other apps running on
> them?
> On Wed, Jan 30, 2013 at 10:14 AM, Richard Nadeau <[EMAIL PROTECTED]> wrote:
>> What do you have set in core-site.xml for io.sort.mb, io.sort.factor, and
>> io.file.buffer.size? You should be able to adjust these and get past the
>> heap issue. Be careful about how much RAM you have, though, and don't set
>> them too high.
>> Rick
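A sketch of where those knobs live, with illustrative values only (in Hadoop 1.x, io.sort.mb and io.sort.factor are normally set in mapred-site.xml rather than core-site.xml, while io.file.buffer.size lives in core-site.xml; the values below are the stock defaults, not tuning advice):

```xml
<!-- mapred-site.xml (Hadoop 1.x); values are the defaults, shown for illustration -->
<property>
  <name>io.sort.mb</name>
  <value>100</value> <!-- per-task sort buffer in MB; multiplied by every concurrent task -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>10</value> <!-- number of streams merged at once during the sort -->
</property>

<!-- core-site.xml -->
<property>
  <name>io.file.buffer.size</name>
  <value>4096</value> <!-- read/write buffer size in bytes -->
</property>
```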
>> On Jan 30, 2013 8:55 AM, "John Omernik" <[EMAIL PROTECTED]> wrote:
>>> So it's filling up on the emitting stage, so I need to look at the task
>>> logs and/or my script that's printing to stdout as the likely culprits, I
>>> am guessing.
>>> On Wed, Jan 30, 2013 at 9:11 AM, Philip Tromans <
>>> [EMAIL PROTECTED]> wrote:
>>>> That particular OutOfMemoryError is happening on one of your hadoop
>>>> nodes. It's the heap within the process forked by the hadoop tasktracker, I
>>>> think.
>>>> Phil.
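For reference, the heap of that forked child process in Hadoop 1.x is typically sized via mapred.child.java.opts in mapred-site.xml (the value below is the historical default, shown for illustration only):

```xml
<!-- mapred-site.xml (Hadoop 1.x): heap for the per-task child JVM that
     the tasktracker forks, i.e. the heap that is overflowing here -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m</value> <!-- default; raise per-task heap with care, it is per slot -->
</property>
```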
>>>> On 30 January 2013 14:28, John Omernik <[EMAIL PROTECTED]> wrote:
>>>>> So just a follow-up. I am less looking for specific troubleshooting on
>>>>> how to fix my problem, and more looking for a general understanding of heap
>>>>> space usage with Hive.  When I get an error like this, is it heap space on
>>>>> a node, or heap space on my hive server?  Is it the heap space of the
>>>>> tasktracker? Heap of the job kicked off on the node?  Which heap is being
>>>>> affected? If it's not clear in my output, where can I better understand
>>>>> this? I am sorely out of my league here when it comes to understanding the
>>>>> JVM interactions of Hive and Hadoop, i.e. where hive is run, vs where task
>>>>> trackers are run etc.
>>>>> Thanks in advance!
>>>>> On Tue, Jan 29, 2013 at 7:43 AM, John Omernik <[EMAIL PROTECTED]> wrote:
>>>>>> I am running a transform script that parses through a bunch of binary
>>>>>> data. In 99% of the cases it runs, it runs fine, but on certain files I get
>>>>>> a failure (as seen below).  Funny thing is, I can run a job with "only" the
>>>>>> problem source file and it will work fine, but when it runs as part of a
>>>>>> group of files, I get these warnings.  I guess what I am asking here is this: Where is the
>>>>>> heap error? Is this occurring on the nodes themselves or, since this is
>>>>>> where the script is emitting records (and potentially large ones at that)
>>>>>> and in this case my hive server running the job may be memory light, could
>>>>>> the issue actually be due to heap on the hive server itself?   My setup is
>>>>>> 1 Hive node (that is woefully underpowered, under memoried, and under disk
>>>>>> I/Oed) and 4 beefy hadoop nodes.  I guess my question is: is the heap
>>>>>> issue on the sender or the receiver? :)
>>>>>> 2013-01-29 08:20:24,107 INFO org.apache.hadoop.hive.ql.io.CodecPool:
>>>>>> Got brand-new compressor
>>>>>> 2013-01-29 08:20:24,107 INFO
>>>>>> org.apache.hadoop.hive.ql.exec.SelectOperator: 12 forwarding 1 rows
>>>>>> 2013-01-29 08:20:24,410 INFO
>>>>>> org.apache.hadoop.hive.ql.exec.ScriptOperator: 3 forwarding 10 rows
>>>>>> 2013-01-29 08:20:24,410 INFO
>>>>>> org.apache.hadoop.hive.ql.exec.SelectOperator: 4 forwarding 10 rows
>>>>>> 2013-01-29 08:20:24,411 INFO
>>>>>> org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarding 10 rows
>>>>>> 2013-01-29 08:20:24,411 INFO