Pig, mail # user - Pig optimization rules


abhishek dodda 2012-10-16, 03:47
Re: Pig optimization rules
Thejas Nair 2012-10-17, 02:04
On 10/15/12 8:47 PM, abhishek dodda wrote:
> hi all,
>
> I am trying to learn and apply Pig optimization rules. Can anyone help
> me understand the properties below?
>
> The amount of memory allocated to bags is determined by
> *pig.cachedbag.memusage;
> the default is set to 20% (0.2) of available memory.* Note that this memory
> is shared across all large bags used by the application.
>
> *Which memory is this? 20% of which memory is allocated?*

This is 20% of the memory available to the map/reduce task, i.e. the
JVM maximum memory limit.

> Which factors should be considered when setting the number of reducers
> for an outer join query with replicated?
> Will increasing the number of reducers in an outer join query improve
> performance?
>

Yes, it should improve performance. One thing to watch for is skew
among the reduce runtimes. If the reduce runtimes are very skewed, you
might want to consider a skewed join.
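
One possible way to put a number on "very skewed" is to compare the
slowest reducer with the median one. The sketch below is only a
heuristic: the function names, the sample runtimes, and the 3x
threshold are assumptions for illustration, not anything Pig provides.
In Pig Latin itself, the reducer count is set with the PARALLEL clause
and a skewed join is requested with USING 'skewed'.

```python
from statistics import median


def reduce_skew_ratio(runtimes_sec):
    """Ratio of the slowest reducer's runtime to the median runtime."""
    if not runtimes_sec:
        raise ValueError("need at least one reducer runtime")
    med = median(runtimes_sec)
    if med <= 0:
        raise ValueError("median runtime must be positive")
    return max(runtimes_sec) / med


def looks_skewed(runtimes_sec, threshold=3.0):
    """True if the slowest reducer took more than `threshold` times the median.

    The default threshold is an arbitrary assumption; tune it per job.
    """
    return reduce_skew_ratio(runtimes_sec) > threshold


# Assumed runtimes (seconds) copied by hand from a job's task list.
balanced = [60, 62, 58, 61, 65]
one_hot_key = [60, 62, 58, 61, 600]
print(looks_skewed(balanced), looks_skewed(one_hot_key))
```

If one reducer dominates like this, adding reducers alone will not help
much, since a single key's rows still land on one reducer.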

-Thejas
abhishek dodda 2012-10-17, 03:15