Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hadoop, mail # user - Setting number of mappers according to number of TextInput lines


Copy link to this message
-
Re: Setting number of mappers according to number of TextInput lines
Edward Capriolo 2012-06-16, 16:12
No. The number of lines is not known at planning time. All you know is
the size of the blocks. You want to look at mapred.max.split.size .

On Sat, Jun 16, 2012 at 5:31 AM, Ondřej Klimpera <[EMAIL PROTECTED]> wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> Seems Hadoop ignores this property :(
>
> My first thought is, that the small file size is the problem and Hadoop
> doesn't care about it's splitting in proper way.
>
> Thanks any ideas.
>
>
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>>
>> Hi Ondrej
>>
>> You can use NLineInputFormat with n set to 10.
>>
>> ------Original Message------
>> From: Ondřej Klimpera
>> To: [EMAIL PROTECTED]
>> ReplyTo: [EMAIL PROTECTED]
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>>
>> Hello,
>>
>> I have very small input size (kB), but processing to produce some output
>> takes several minutes. Is there a way how to say, file has 100 lines, i
>> need 10 mappers, where each mapper node has to process 10 lines of input
>> file?
>>
>> Thanks for advice.
>> Ondrej Klimpera
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>