Subject: Re: java.lang.OutOfMemoryError: Java heap space
Thanks, Park, for sharing the above configs.
But I am wondering whether those config changes would make much of a
difference in my case.
Looking at my logs, the line that worries me most is this one:
INFO org.apache.hadoop.mapred.MapTask: Record too large for in-memory
buffer: 644245358 bytes
If I am understanding it properly, a single record of mine is too large
to fit into the in-memory buffer, and that is what is causing the issue.
If so, none of the above changes would make much of an impact. Please
correct me if I have this totally wrong.
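To put that number in perspective, here is a rough back-of-the-envelope
check (Python used purely for the arithmetic). It assumes io.sort.mb is
at the Hadoop default of 100 MB; the actual value for this job is not
stated in this thread. It also treats the whole buffer as usable for
record data, which is a simplification.

```python
# Size reported in the MapTask log line quoted above.
record_bytes = 644245358

# ASSUMPTION: io.sort.mb is at the Hadoop default of 100 MB; the real
# setting for this job is not given in this thread.
io_sort_mb = 100
buffer_bytes = io_sort_mb * 1024 * 1024

# Convert the record size to MiB and compare it with the assumed buffer.
record_mib = record_bytes / (1024 * 1024)
fits = record_bytes <= buffer_bytes

print(f"record: {record_mib:.1f} MiB, buffer: {io_sort_mb} MiB, fits: {fits}")
```

Under that assumption the single record is several times larger than
the whole sort buffer, which would be consistent with the log message.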
- Adding the Hadoop user group here as well, in case anyone there has
input that helps answer the above question.
Since I am doing a join on a grouped bag, do you think that might be the
cause? But if that were the issue, as far as I understand bags in Pig
are spillable, so it shouldn't have produced this error.
I can't easily get rid of the group-by; grouping first should ideally
improve my join. But if this is the root cause, and I am understanding
it correctly, do you think I should get rid of the group-by?
My question in that case would be what happens if I do the group-by
later, after the join: it will result in a much bigger bag (because
there would be more records after the join).
Am I thinking about this correctly?
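A toy illustration of that concern, in plain Python rather than Pig.
The relations, key names, and row counts below are hypothetical and do
not come from the real job; the sketch only shows how a bag built after
an inner join can be larger than one built before it.

```python
from collections import defaultdict

# Hypothetical toy relations; none of these values come from the real job.
left = [("k1", i) for i in range(3)]    # 3 records for key k1
right = [("k1", j) for j in range(4)]   # 4 matching records for key k1

def group_by_key(records):
    """Collect records into one bag (list) per key, like a GROUP BY."""
    bags = defaultdict(list)
    for rec in records:
        bags[rec[0]].append(rec)
    return bags

# Option A: group the left relation first; the bag holds only left rows.
bag_before_join = group_by_key(left)["k1"]

# Option B: inner join first, then group; each left row is repeated once
# per matching right row, so the bag holds left x right rows.
joined = [(lk, lv, rv) for lk, lv in left for rk, rv in right if lk == rk]
bag_after_join = group_by_key(joined)["k1"]

print(len(bag_before_join), len(bag_after_join))
```

In this toy case the bag grows from 3 rows to 12 when the grouping is
moved after the join; how much it grows in practice depends on how many
matches each key has on the other side.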
On Fri, Feb 7, 2014 at 3:11 AM, Cheolsoo Park <[EMAIL PROTECTED]> wrote: