Pig >> mail # user >> java.lang.OutOfMemoryError: Java heap space


Re: java.lang.OutOfMemoryError: Java heap space
Thanks, Park, for sharing the above configs.

But I am wondering whether those config changes would make much
difference in my case.
Going by my logs, the line that worries me most is:

 INFO org.apache.hadoop.mapred.MapTask: Record too large for in-memory buffer: 644245358 bytes

If I am understanding it properly, a single record of mine is too large
to fit into memory, and that is causing the issue.
In that case none of the above changes would make much of an impact.
Please correct me if I have this totally wrong.
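
For scale, here is a rough size check in Python. It is only a sketch: the 100 MB buffer is an assumed value for io.sort.mb, not something taken from the logs, so substitute the cluster's actual setting. The variable names are made up for illustration.

```python
# Record size reported by the MapTask log line above.
record_bytes = 644245358

# ASSUMPTION: map-side sort buffer size in MB (io.sort.mb).
# This value is not from the logs; use the cluster's real setting.
assumed_sort_buffer_mb = 100

# Convert both figures to comparable units.
record_mib = record_bytes / (1024 * 1024)
buffer_bytes = assumed_sort_buffer_mb * 1024 * 1024

# True when the single record cannot fit in the assumed buffer.
exceeds_buffer = record_bytes > buffer_bytes

print(f"record size: {record_mib:.1f} MiB")
print(f"larger than the assumed buffer: {exceeds_buffer}")
```

Under that assumption, one record of roughly 614 MiB is several times the size of the buffer, so a modest increase in the buffer alone would not make it fit.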

 - I am also adding the Hadoop user group here, in case anyone has
input that helps with the above question.
Since I am doing a join on a grouped bag, do you think that might be the cause?

But if that is the issue: as far as I understand, bags in Pig are
spillable, so it shouldn't have caused this problem.

I would rather not get rid of the group by, since grouping first should
ideally improve my join. But if this is the root cause, and I am
understanding it correctly, do you think I should get rid of the group by?

But my question in that case would be what happens if I do the group
by later, after the join. It will result in a much bigger bag (because
there would be more records after the join).

Am I thinking about this correctly?
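
The concern about bag size can be illustrated with a toy example in plain Python (not Pig; the data and helper names are invented for illustration). When one key matches several rows on the other side, the join multiplies the rows for that key, so a bag built after the join can be larger than one built before it.

```python
from collections import defaultdict

# Hypothetical toy relations: (key, value) rows.
left = [("k1", 0), ("k1", 1), ("k1", 2)]
right = [("k1", "a"), ("k1", "b")]


def group_by_key(rows):
    """Collect rows into one list ("bag") per key, keyed on the first field."""
    bags = defaultdict(list)
    for row in rows:
        bags[row[0]].append(row)
    return bags


def inner_join(left_rows, right_rows):
    """Naive inner join on the first field of each row."""
    return [
        (lk, lv, rv)
        for lk, lv in left_rows
        for rk, rv in right_rows
        if lk == rk
    ]


# Bag size for key "k1" when grouping before the join.
bag_before = len(group_by_key(left)["k1"])

# Bag size for key "k1" when grouping after the join.
bag_after = len(group_by_key(inner_join(left, right))["k1"])

print(bag_before, bag_after)
```

In this toy data every left row matches two right rows, so the bag for the key doubles when grouping happens after the join. Whether the same holds for the real data depends on how many matches each key has.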

Regards

Prav

On Fri, Feb 7, 2014 at 3:11 AM, Cheolsoo Park <[EMAIL PROTECTED]> wrote: