Pig >> mail # user >> different mapred.min.split.size within one pig script?


Re: different mapred.min.split.size within one pig script?
Thanks,

I tried that, but it does not seem to work. Even after I put the second
SET of split.size= at the very end of the script, it is the second SET
that takes effect in both places where I used SET.

Yang
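
For readers following along, here is a minimal sketch of one possible
workaround. It is not confirmed anywhere in this thread: it assumes that,
since the last SET seems to win for the whole script, the two stages can be
split into two scripts that are run as separate pig invocations, with the
first stage's output stored and then loaded by the second. The file names,
paths, aliases, split-size values, and the UPPER/COUNT operations are all
hypothetical placeholders.

```pig
-- part1.pig (hypothetical name): the stage that is slow per input record.
-- Small minimum split size, illustrative value only (1 MB).
SET mapred.min.split.size 1048576;
raw   = LOAD 'input/raw' AS (line:chararray);
-- Stand-in for the expensive per-record work described in the thread.
heavy = FOREACH raw GENERATE UPPER(line) AS line;
STORE heavy INTO 'tmp/heavy';

-- part2.pig (hypothetical name): the remaining stages, run as a separate
-- pig invocation after part1.pig finishes, so the SET above cannot affect it.
-- Larger minimum split size, illustrative value only (256 MB).
SET mapred.min.split.size 268435456;
heavy   = LOAD 'tmp/heavy' AS (line:chararray);
grouped = GROUP heavy BY line;
counts  = FOREACH grouped GENERATE group, COUNT(heavy);
STORE counts INTO 'output/counts';
```

The cost of this approach is an extra write and read of the intermediate
data in 'tmp/heavy', which may or may not be acceptable for a given job.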

On Tue, Jun 12, 2012 at 3:56 PM, Alex Rovner <[EMAIL PROTECTED]> wrote:

> Yes. Use the "set" keyword right before the operation that needs this
> setting. Since Pig will optimize certain statements and collapse them into
> a single job, you would have to move your statement up a couple of
> instructions in order for it to take effect.
>
> Sent from my iPhone
>
> On Jun 10, 2012, at 10:06 PM, Yang <[EMAIL PROTECTED]> wrote:
>
> > I need to set mapred.min.split.size for one part of my pig script
> > because the mapper job corresponding to the first part of the script
> > takes much longer time per input record than other parts of the script.
> >
> > so I have to set the split size very small to take care of that
> > particular script,
> >
> > but then later parts of the script also used this value and used too many
> > splits,
> >
> > is it possible to set min.split.size value to different values within the
> > same script?
> >
> > Thanks
> > Yang
>