Re: Setting number of mappers according to number of TextInput lines
Edward Capriolo 2012-06-16, 16:12
No. The number of lines is not known at job-planning time; all the
framework knows is the size of the blocks. Take a look at
mapred.max.split.size.
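
To see why that property matters for a kB-sized file, here is a rough sketch of the split arithmetic as I understand it from the new-API FileInputFormat: split size = max(minSize, min(maxSize, blockSize)), with a 10% slop allowed on the last split. This is not Hadoop code. The class and method names are made up for illustration, and the 64 MB block size and 10 kB file length are assumed values.

```java
// Hypothetical helper, not part of Hadoop: mimics how the new-API
// FileInputFormat is understood to size and count splits for one file.
final class SplitEstimate {
    // The last split may be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    static int countSplits(long fileLength, long blockSize, long minSize, long maxSize) {
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long remaining = fileLength;
        int splits = 0;
        // Cut full-sized splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed 64 MB block size
        long fileLength = 10L * 1024;       // assumed 10 kB input file

        // No max split size set: the whole file fits in one split.
        System.out.println(countSplits(fileLength, blockSize, 1L, Long.MAX_VALUE)); // prints 1

        // Max split size of 1024 bytes: ten splits, hence ten mappers.
        System.out.println(countSplits(fileLength, blockSize, 1L, 1024L)); // prints 10
    }
}
```

Keep in mind the splits are cut by bytes, not by lines, so each mapper gets roughly a tenth of the lines, not exactly ten.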
On Sat, Jun 16, 2012 at 5:31 AM, Ondřej Klimpera <[EMAIL PROTECTED]> wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> It seems Hadoop ignores this property :(
> My first thought is that the small file size is the problem and Hadoop
> doesn't bother splitting it properly.
> Thanks for any ideas.
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>> Hi Ondrej
>> You can use NLineInputFormat with n set to 10.
>> ------Original Message------
>> From: Ondřej Klimpera
>> To: [EMAIL PROTECTED]
>> ReplyTo: [EMAIL PROTECTED]
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>> I have a very small input (kB), but processing it to produce some output
>> takes several minutes. Is there a way to say: the file has 100 lines, I
>> need 10 mappers, and each mapper node has to process 10 lines of input?
>> Thanks for any advice.
>> Ondrej Klimpera
>> Bejoy KS
>> Sent from handheld, please excuse typos.
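
For reference, the NLineInputFormat suggestion quoted above amounts to handing each mapper a fixed number of input lines instead of a byte range. The sketch below only illustrates that grouping on an in-memory list. It does not use the Hadoop API, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Hadoop code: groups input lines into
// fixed-size chunks, one chunk per mapper, the way a lines-per-split
// input format is expected to behave.
final class LinesPerSplit {
    static List<List<String>> group(List<String> lines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            // Copy the slice so each split owns its own lines.
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            lines.add("record-" + i);
        }
        // 100 lines at 10 lines per split gives 10 splits.
        System.out.println(group(lines, 10).size()); // prints 10
    }
}
```

With 100 lines and n set to 10 this yields the ten 10-line chunks asked for in the original question; if the line count is not a multiple of n, the last chunk is simply shorter.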