Pig >> mail # user >> Pig optimization rules


Re: Pig optimization rules
On 10/15/12 8:47 PM, abhishek dodda wrote:
> hi all,
>
> I am trying to learn and implement Pig optimization rules. Can anyone help
> me understand the properties below?
>
> The amount of memory allocated to bags is determined by
> *pig.cachedbag.memusage;
> the default is set to 20% (0.2) of available memory.* Note that this memory
> is shared across all large bags used by the application.
>
> *Which memory is this? 20% of which memory is allocated?*

This is 20% of the memory available to the map/reduce task, i.e. the JVM's
maximum memory limit.
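
To make the arithmetic concrete, here is a minimal Java sketch of that calculation. It assumes the budget is simply the JVM maximum memory multiplied by pig.cachedbag.memusage. The class BagMemoryBudget and the helper bagBudgetBytes are hypothetical names for illustration, not Pig's own code, and the sketch does not model how Pig divides the budget among bags.

```java
// Sketch only: estimates the bag memory budget inside one task JVM.
// BagMemoryBudget and bagBudgetBytes are hypothetical names, not Pig APIs.
final class BagMemoryBudget {
    // Default value of pig.cachedbag.memusage, as quoted above (20%).
    static final double DEFAULT_MEMUSAGE = 0.2;

    // Returns the given fraction of the JVM maximum memory, in bytes.
    static long bagBudgetBytes(long jvmMaxBytes, double memusage) {
        if (memusage < 0.0 || memusage > 1.0) {
            throw new IllegalArgumentException("memusage must be between 0 and 1");
        }
        // The cast truncates any fractional byte.
        return (long) (jvmMaxBytes * memusage);
    }

    public static void main(String[] args) {
        // maxMemory() is the most memory this JVM will attempt to use.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("JVM max memory (bytes): " + max);
        System.out.println("Bag budget at 0.2 (bytes): "
                + bagBudgetBytes(max, DEFAULT_MEMUSAGE));
    }
}
```

For example, a task JVM with a 1 GiB maximum would leave roughly 205 MiB for bags at the default setting.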

> Which factors should be considered when setting the number of reducers for an
> outer join query with replicated?
> Will increasing the number of reducers in an outer join query improve
> performance?
>

Yes, it should increase performance. One thing to watch for is skew
among the reduce task runtimes. If the reduce runtimes are very skewed, you
might want to consider a skewed join.
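
One rough way to quantify "very skewed" is to compare the slowest reduce task with the median one. The Java sketch below assumes you have already collected per-reducer runtimes (for instance from the job's task list). The class ReduceSkew and the method maxToMedianRatio are hypothetical names, and what ratio counts as too high is left to your judgment.

```java
import java.util.Arrays;

// Sketch only: a hypothetical helper for eyeballing reduce-side skew.
final class ReduceSkew {
    // Ratio of the slowest reducer's runtime to the median runtime.
    // A value near 1.0 means the reducers finished in similar time.
    static double maxToMedianRatio(long[] runtimesMs) {
        if (runtimesMs.length == 0) {
            throw new IllegalArgumentException("no runtimes given");
        }
        long[] sorted = runtimesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        if (median <= 0.0) {
            throw new IllegalArgumentException("median runtime must be positive");
        }
        return sorted[n - 1] / median;
    }

    public static void main(String[] args) {
        // Made-up runtimes in milliseconds, for illustration only.
        long[] runtimes = {10_000L, 10_000L, 10_000L, 90_000L};
        System.out.println("max/median = " + maxToMedianRatio(runtimes));
    }
}
```

A ratio far above 1 suggests that a few keys dominate the join, which is the situation a skewed join is meant to address.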

-Thejas