On 10/15/12 8:47 PM, abhishek dodda wrote:
> hi all,
> I am trying to learn and implement Pig optimization rules. Can anyone help
> me understand the properties below?
> The amount of memory allocated to bags is determined by
> the default is set to 20% (0.2) of available memory. Note that this memory
> is shared across all large bags used by the application.
> Which memory is this? 20% of which memory is allocated?
This is 20% of the memory available to the map/reduce task, i.e. the JVM maximum
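
To make the arithmetic concrete, here is a minimal sketch, assuming the 0.2 fraction is applied to the maximum heap of the task JVM. The function name and the 1 GB heap are made up for the example; they are not Pig settings.

```python
def bag_memory_budget(max_heap_bytes, fraction=0.2):
    """Bytes shared by all large bags, given the task JVM's maximum heap.

    Assumption: the fraction (default 0.2, i.e. 20%) applies to the
    JVM maximum heap of the map or reduce task.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return int(max_heap_bytes * fraction)


# Hypothetical task JVM with a 1 GB maximum heap.
heap = 1024 * 1024 * 1024
budget = bag_memory_budget(heap)
print(budget // (1024 * 1024), "MB shared across all large bags")
```

Because the budget is shared, every large bag in the task draws on this one pool; it is not a per-bag allowance.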
> Which factors should be considered when setting the number of reducers for an
> outer join query with replicated?
> Will increasing the number of reducers in an outer join query improve
> performance?
Yes, it should increase performance. One thing to watch for is skew
among the reduce runtimes. If the reduce runtimes are very skewed, you
might want to consider a skew join.
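
One rough way to judge whether the reduce runtimes are skewed is to compare the slowest reducer with the median one. The sketch below assumes you have copied the per-reducer durations from the job's task list by hand. The function name and the sample numbers are invented for illustration.

```python
from statistics import median


def reduce_skew_ratio(runtimes_sec):
    """Ratio of the slowest reduce task's runtime to the median runtime."""
    if not runtimes_sec:
        raise ValueError("need at least one runtime")
    mid = median(runtimes_sec)
    if mid <= 0:
        raise ValueError("median runtime must be positive")
    return max(runtimes_sec) / mid


# Hypothetical runtimes in seconds: four similar reducers and one straggler.
sample = [62, 58, 65, 60, 540]
ratio = reduce_skew_ratio(sample)
print(f"slowest reducer took {ratio:.1f}x the median")
```

A ratio close to 1 suggests the work is spread evenly, so adding reducers is likely to help. A ratio of several times the median suggests that a few keys dominate, in which case extra reducers mostly sit idle while the straggler finishes.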