Pig user mailing list: Reduce Tasks


Mohit Anchlia 2013-02-01, 22:42
Harsha 2013-02-01, 22:44
Mohit Anchlia 2013-02-01, 22:54
Harsha 2013-02-01, 23:15
Mohit Anchlia 2013-02-02, 00:53
Alan Gates 2013-02-02, 01:04
Re: Reduce Tasks
Sorry, my question was about mapred.map.tasks; I mistakenly specified the
wrong parameter. In Pig I am setting mapred.map.tasks to 200, but more map
tasks are being executed.
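
For reference: mapred.map.tasks is only a hint to Hadoop, and the number of
map tasks actually launched is driven by the number of input splits. Below
is a rough Pig Latin sketch of one way to get fewer maps by combining small
splits; the path, schema, and size are made up for illustration, and it
assumes Pig's split combination feature is available and enabled (recent Pig
versions turn it on by default).

-- Combine small input splits so fewer map tasks are launched
-- (value is in bytes; 256 MB here is an arbitrary example)
SET pig.maxCombinedSplitSize 268435456;

-- hypothetical input path and schema, for illustration only
raw = LOAD '/data/events' USING PigStorage('\t') AS (id:chararray, val:int);
grouped = GROUP raw BY id;
counts = FOREACH grouped GENERATE group, COUNT(raw);
STORE counts INTO '/data/event_counts';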

On Fri, Feb 1, 2013 at 5:04 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> Setting mapred.reduce.tasks won't work, as Pig overrides it. See
> http://pig.apache.org/docs/r0.10.0/perf.html#parallel for info on how to
> set the number of reducers in Pig.
>
> Alan.
>
> On Feb 1, 2013, at 4:53 PM, Mohit Anchlia wrote:
>
> > Just a slightly different problem: I tried setting mapred.reduce.tasks to
> > 200 with SET in Pig, but more reduce tasks were still launched for that
> > job. Is there any other way to set the parameter?
> >
> > On Fri, Feb 1, 2013 at 3:15 PM, Harsha <[EMAIL PROTECTED]> wrote:
> >
> >>
> >> It's the total number of reducers, not the number of active reducers.
> >> If you specify a lower number, each reducer gets more data to process.
> >> --
> >> Harsha
> >>
> >>
> >> On Friday, February 1, 2013 at 2:54 PM, Mohit Anchlia wrote:
> >>
> >>> Thanks! Is there a downside to reducing the number of reducers? I am
> >>> trying to alleviate high CPU usage.
> >>>
> >>> With a low number of reducers set via the PARALLEL clause, does that
> >>> mean more data is processed by each reducer, or does it control how
> >>> many reducers can be active at one time?
> >>>
> >>> On Fri, Feb 1, 2013 at 2:44 PM, Harsha <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Mohit,
> >>>> you can use the PARALLEL clause to specify reduce tasks. More info here:
> >>>> http://pig.apache.org/docs/r0.8.1/cookbook.html#Use+the+Parallel+Features
> >>>>
> >>>> --
> >>>> Harsha
> >>>>
> >>>>
> >>>> On Friday, February 1, 2013 at 2:42 PM, Mohit Anchlia wrote:
> >>>>
> >>>>> Is there a way to specify the max number of reduce tasks that a job
> >>>>> should spawn in a Pig script, without having to restart the cluster?
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
>
>
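
A short Pig Latin sketch pulling together the advice above: the reducer count
is set script-wide with default_parallel or per-operator with PARALLEL, and
the number given is the total number of reduce tasks for the job, not how
many run at once (concurrency is bounded by the cluster's reduce slots). The
paths, schema, and numbers below are made up for illustration.

-- Script-wide default for all reduce-side operators (GROUP, JOIN, ORDER, ...)
SET default_parallel 50;

-- hypothetical input, for illustration only
logs = LOAD '/data/logs' USING PigStorage('\t') AS (user:chararray, bytes:long);

-- Per-operator override: this GROUP runs with 200 reduce tasks in total;
-- fewer reducers means each one processes more data (and runs longer)
by_user = GROUP logs BY user PARALLEL 200;
totals = FOREACH by_user GENERATE group, SUM(logs.bytes);
STORE totals INTO '/data/bytes_by_user';
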
Rohini Palaniswamy 2013-02-06, 21:30