Pig >> mail # user >> Pig optimization rules


Re: Pig optimization rules
On 10/15/12 8:47 PM, abhishek dodda wrote:
> hi all,
>
> I am trying to learn and implement Pig optimization rules. Can anyone help
> me understand the properties below?
>
> The amount of memory allocated to bags is determined by
> *pig.cachedbag.memusage;
> the default is set to 20% (0.2) of available memory.* Note that this memory
> is shared across all large bags used by the application.
>
> *Which memory is this? 20% of which memory is allocated?*

This is 20% of the memory available to the map/reduce task, i.e. the
maximum memory limit of the task's JVM.
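
To make the arithmetic concrete, here is a minimal Java sketch. It assumes the budget is simply the JVM's reported maximum memory multiplied by the configured fraction. The class and method names are hypothetical, and this is not Pig's actual code; it only shows what "20% of the JVM maximum memory" works out to.

```java
// Illustrative only: hypothetical names, not part of Pig.
class BagMemoryBudget {

    // Returns the number of bytes corresponding to the given fraction
    // (e.g. 0.2 for pig.cachedbag.memusage) of the JVM maximum memory.
    static long bagBudgetBytes(long maxJvmBytes, double memUsage) {
        if (memUsage < 0.0 || memUsage > 1.0) {
            throw new IllegalArgumentException("memUsage must be in [0, 1]");
        }
        return (long) (maxJvmBytes * memUsage);
    }

    public static void main(String[] args) {
        // Maximum memory the current JVM will attempt to use.
        long maxJvmBytes = Runtime.getRuntime().maxMemory();
        long budget = bagBudgetBytes(maxJvmBytes, 0.2);
        System.out.println("JVM max memory (bytes): " + maxJvmBytes);
        System.out.println("20% bag budget (bytes): " + budget);
    }
}
```

For a JVM maximum of 1 GiB, this gives about 204.8 MiB, and per the description quoted above that amount is shared across all large bags rather than given to each one.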

> Which factors should be considered when setting the number of reducers
> for an outer join query with replicated?
> Will increasing the number of reducers in an outer join query improve
> performance?
>

Yes, it should increase performance. One thing to watch for is skew
among the reduce task runtimes. If the reduce runtimes are very skewed,
you might want to consider a skewed join.
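
One rough way to quantify "very skewed" is to compare the slowest reduce task against the median one. The Java sketch below assumes you have already collected the per-task reduce runtimes (for example from the job's task list). The class name, method name, and the choice of a max-to-median ratio are all hypothetical illustrations, not something Pig or Hadoop provides.

```java
import java.util.Arrays;

// Illustrative only: hypothetical helper, not part of Pig or Hadoop.
class ReduceSkewCheck {

    // Ratio of the slowest reduce task's runtime to the median runtime.
    // A value near 1.0 means the reducers finished in similar time;
    // a large value suggests a few reducers received most of the data.
    static double skewRatio(long[] runtimesSeconds) {
        if (runtimesSeconds.length == 0) {
            throw new IllegalArgumentException("no runtimes given");
        }
        long[] sorted = runtimesSeconds.clone();
        Arrays.sort(sorted);
        // Upper median; clamp to 1 to avoid dividing by zero.
        long median = Math.max(1L, sorted[sorted.length / 2]);
        long slowest = sorted[sorted.length - 1];
        return (double) slowest / median;
    }

    public static void main(String[] args) {
        // Made-up runtimes for illustration: one reducer is far slower.
        long[] runtimes = {60, 62, 58, 61, 600};
        System.out.printf("skew ratio: %.2f%n", skewRatio(runtimes));
    }
}
```

What ratio counts as "too skewed" is a judgment call that depends on the job; the sketch only makes the comparison explicit.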

-Thejas