
abhishek dodda 2012-10-16, 03:47
Re: Pig optimization rules
On 10/15/12 8:47 PM, abhishek dodda wrote:
> Hi all,
> I am trying to learn and apply Pig optimization rules. Can anyone help
> me understand the properties below?
> The amount of memory allocated to bags is determined by
> *pig.cachedbag.memusage;
> the default is set to 20% (0.2) of available memory.* Note that this memory
> is shared across all large bags used by the application.
> *Which memory is this? 20% of which memory is allocated?*

This is 20% of the memory available to the map/reduce task, i.e. the
JVM's maximum memory limit for that task.
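
As a rough illustration of the arithmetic: if the task JVM runs with a 1 GB
maximum heap, the default of 0.2 leaves about 205 MB to be shared by all
large bags in that task. Below is a minimal sketch of setting both values
from a script. It assumes that Pig's SET statement passes these properties
through to the job configuration and that your cluster sizes the task heap
with the Hadoop 1.x property mapred.child.java.opts. The numbers are
illustrative, not recommendations.

```pig
-- Illustrative values only; tune them for your own cluster.
-- Assumption: the task heap size is controlled by mapred.child.java.opts (Hadoop 1.x).
SET mapred.child.java.opts '-Xmx1024m';

-- Fraction of the task JVM's maximum memory shared by all large bags.
-- 0.2 is the default; with a 1024 MB heap that is roughly 205 MB.
SET pig.cachedbag.memusage 0.2;
```

Raising the fraction lets bags hold more data in memory before spilling to
disk, at the cost of leaving less heap for everything else in the task.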

> Which factors should be considered when setting the number of reducers for
> an outer join query with replicated?
> Will increasing the number of reducers in an outer join query improve
> performance?

Yes, it should increase performance. One thing to watch for is skew
among the reduce task runtimes. If the runtimes are very skewed, you
might want to consider a skewed join.
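
For example, a sketch of what this could look like in a script. The
relation names, input paths, schemas and the reducer count of 20 are all
made up for illustration. PARALLEL sets the number of reducers for the
join, and USING 'skewed' switches to Pig's skewed join.

```pig
-- Hypothetical inputs and schemas, for illustration only.
users  = LOAD 'users'  AS (id:int, name:chararray);
clicks = LOAD 'clicks' AS (user_id:int, url:chararray);

-- Default (reduce-side) left outer join; PARALLEL sets the number of reducers.
joined = JOIN users BY id LEFT OUTER, clicks BY user_id PARALLEL 20;

-- If a few keys dominate and some reducers run much longer than the rest,
-- a skewed join spreads the heavy keys across several reducers.
joined_skewed = JOIN users BY id LEFT OUTER, clicks BY user_id USING 'skewed' PARALLEL 20;

STORE joined_skewed INTO 'joined_out';
```

Comparing per-reducer runtimes for the two variants in the job tracker is a
simple way to check whether skew was actually the bottleneck.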

abhishek dodda 2012-10-17, 03:15