Indeed, worked like a charm.
One strange thing I see is that the more reducers I run, the more
data I get in the output. My suspicion is that, since I use some
global counters in my reducers, it could be that a second call
overwrites the results of the first. Oh, well... back to the
drawing board :)
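
For what it's worth, here is a rough sketch of why per-reducer "global" state may behave differently once there is more than one reduce task. It assumes the default HashPartitioner, whose formula is (hashCode & Integer.MAX_VALUE) % numReduceTasks. Each reduce task normally runs in its own JVM, so a static counter only ever sees the keys routed to that task. The class name, method names and sample keys are made up for illustration, and plain String keys stand in for Hadoop's Writable keys.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder.
    public static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many keys each simulated reduce task would receive.
    public static int[] keysPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys, chosen only for the demonstration.
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println(Arrays.toString(keysPerReducer(keys, 1))); // prints [6]
        System.out.println(Arrays.toString(keysPerReducer(keys, 4))); // prints [1, 2, 2, 1]
    }
}
```

With one reduce task a single counter sees all six keys; with four, each task's counter sees only its own slice, and each task writes its own output file.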
On Mon, Feb 20, 2012 at 11:26 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> The default value for "mapred.reduce.tasks" is indeed "1".
> For your cluster, you should tune your client configuration to
> carry a suitable number for that property, in mapred-site.xml
> (http://wiki.apache.org/hadoop/HowManyMapsAndReduces might help you
> decide how many), or pass it along as a "-Dmapred.reduce.tasks=Number"
> parameter when you submit a job.
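
A minimal sketch of the mapred-site.xml entry described above, for reference. The value 4 is only a placeholder, not a recommendation; the wiki page linked above is the place to look for sizing advice.

```xml
<configuration>
  <!-- Placeholder value: pick a number suited to your cluster. -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
  </property>
</configuration>
```

The -D form is parsed by Hadoop's GenericOptionsParser, so it should only take effect when the driver is launched through ToolRunner or otherwise uses that parser.
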
> On Tue, Feb 21, 2012 at 10:34 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I used to do
> > job.setNumReduceTasks(1);
> > but I realized that this is bad and commented out this line
> > //job.setNumReduceTasks(1);
> > I still see the number of reduce tasks as 1 when my mappers number 4. Why
> > could this be?
> > Thank you,
> > Mark
> Harsh J
> Customer Ops. Engineer
> Cloudera | http://tiny.cloudera.com/about