Pig, mail # user - different mapred.min.split.size within one pig script?


Re: different mapred.min.split.size within one pig script?
Yang 2012-06-14, 06:08
Thanks,

I tried that, but it does not seem to work. Even after I put the second
SET of the split size at the very end of the script,
it is the second SET that takes effect in both places where I used SET.

Yang

On Tue, Jun 12, 2012 at 3:56 PM, Alex Rovner <[EMAIL PROTECTED]> wrote:

> Yes. Use the "set" keyword right before the operation that needs this
> setting. Since pig will optimize certain statements and collapse them into
> a single job, you would have to move your statement up a couple
> instructions in order for it to take effect.
>
> On Jun 10, 2012, at 10:06 PM, Yang <[EMAIL PROTECTED]> wrote:
>
> > I need to set mapred.min.split.size for one part of my pig script
> > because the mapper job corresponding to the first part of the script
> > takes much longer time per input record than other parts of the script.
> >
> > so I have to set the split size very small to take care of that
> > particular script,
> >
> > but then later parts of the script also used this value and used too many
> > splits,
> >
> > is it possible to set min.split.size value to different values within the
> > same script?
> >
> > Thanks
> > Yang
>
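
One possible workaround, sketched here under the assumption that a SET value applies to the whole script (which is what Yang's test suggests) rather than to the statements that follow it, is to split the work into two scripts and run them one after the other, each with its own SET. The file names, paths, schema, aliases, and split-size values below are all hypothetical placeholders and do not come from the thread; the FOREACH in the first script merely stands in for the expensive per-record work.

```pig
-- part1.pig (hypothetical file name): the stage that is slow per input record.
-- The value is a placeholder for whatever small split size this stage needs.
SET mapred.min.split.size 1048576;

raw  = LOAD '/data/input' USING PigStorage('\t')
       AS (id:chararray, payload:chararray);      -- assumed path and schema
slow = FOREACH raw GENERATE id, payload;          -- placeholder for the slow step
STORE slow INTO '/tmp/stage1' USING PigStorage('\t');

-- part2.pig (hypothetical file name): the remaining stages, run as a separate
-- script so that they get their own split size setting.
SET mapred.min.split.size 134217728;              -- placeholder value

stage1  = LOAD '/tmp/stage1' USING PigStorage('\t')
          AS (id:chararray, payload:chararray);
grouped = GROUP stage1 BY id;
counts  = FOREACH grouped GENERATE group AS id, COUNT(stage1) AS n;
STORE counts INTO '/data/output' USING PigStorage('\t');
```

Running part1.pig and then part2.pig keeps the two settings apart at the cost of writing the intermediate result to storage. Whether this is needed depends on the Pig version in use, so it is worth checking how that version treats repeated SET statements before restructuring a script.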