Thamizhannal Paramasivam 2012-02-16, 06:40
Joey Echeverria 2012-02-16, 13:26
Thamizhannal Paramasivam 2012-02-16, 16:03
If your input consists of text files, then changing the input format to TextInputFormat should fix this. You get one mapper for each HDFS block.
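To put a rough number on "one mapper for each HDFS block", here is a small standalone sketch of the arithmetic. It is not Hadoop code: the class and method names are made up for illustration, and the 64 MB block size is an assumption (check dfs.block.size on your cluster). It also ignores the small tolerance the framework allows for a slightly oversized final split, so treat the result as an estimate. As far as I recall, MultiFileInputFormat instead groups the files into as many splits as the requested map count (mapred.map.tasks, which defaults to 2), which would explain the 2 mappers.

```java
// Hypothetical helper, not part of Hadoop: estimates the number of map tasks
// when each HDFS block of each input file becomes one input split.
public class SplitEstimate {
    // Assumed block size of 64 MB; the real value comes from the cluster config.
    static final long BLOCK_SIZE = 64L * 1024 * 1024;

    // Ceiling division: how many block-sized splits cover a file of this size.
    static long splitsForFile(long fileBytes, long blockBytes) {
        if (fileBytes <= 0) {
            return 0;
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long files = 1000; // the thread mentions more than 1000 input files

        // Files are 100-200 MB each, so bound the estimate from both ends.
        long low = files * splitsForFile(100 * mb, BLOCK_SIZE);
        long high = files * splitsForFile(200 * mb, BLOCK_SIZE);

        System.out.println("Estimated map tasks: " + low + " to " + high);
    }
}
```

With these assumptions a 100 MB file spans 2 blocks and a 200 MB file spans 4, so the job would get on the order of a few thousand map tasks rather than 2.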
Bejoy K S
From handheld, Please excuse typos.
From: Thamizhannal Paramasivam <[EMAIL PROTECTED]>
Date: Thu, 16 Feb 2012 21:33:11
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Re: num of reducer
Here is the input format for the mapper:
Input Format: MultiFileInputFormat
MapperOutputKey : Text
I am not in a position to upgrade from hadoop-0.19.2, for various reasons.
I checked the number of mappers on the JobTracker.
On Thu, Feb 16, 2012 at 6:56 PM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
> Hi Tamil,
> I'd recommend upgrading to a newer release as 0.19.2 is very old. As for
> your question, most input formats should set the number of mappers correctly.
> What input format are you using? Where did you see the number of tasks it
> assigned to the job?
> On Thu, Feb 16, 2012 at 1:40 AM, Thamizhannal Paramasivam <
> [EMAIL PROTECTED]> wrote:
>> Hi All,
>> I am using hadoop-0.19.2 and running a mapper-only job on a cluster. Its
>> input path has >1000 files of 100-200MB each. Since it is a mapper-only job, I
>> set the number of reducers to 0. However, it is using only 2 mappers to process
>> all the input files. If we do not specify the number of mappers, wouldn't it
>> pick 1 mapper per input file? Or wouldn't the default pick a fair number of
>> mappers according to the number of input files?
> Joseph Echeverria
> Cloudera, Inc.
Joey Echeverria 2012-02-16, 16:36
Thamizhannal Paramasivam 2012-02-17, 04:56
Bejoy Ks 2012-02-17, 09:38
Thamizhannal Paramasivam 2012-02-17, 17:37