Re: Setting number of mappers according to number of TextInput lines
No. The number of lines is not known at planning time; all you know is
the size of the blocks. You want to look at mapred.max.split.size.
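
[Editor's note: for illustration only, a minimal driver sketch of the split-size approach suggested above, assuming the Hadoop 1.x mapreduce API (in Hadoop 2.x the property is mapreduce.input.fileinputformat.split.maxsize). The class name, the 1 kB value, and the argument handling are made up, not taken from this thread.]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SmallInputDriver {  // hypothetical driver class
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Cap each split at ~1 kB (illustrative value): a 10 kB file then yields
    // roughly 10 splits, i.e. ~10 map tasks, instead of one split for the whole file.
    conf.setLong("mapred.max.split.size", 1024L);
    Job job = new Job(conf, "small-input-job");      // job name is made up
    job.setJarByClass(SmallInputDriver.class);
    job.setMapperClass(Mapper.class);                // identity mapper as a placeholder
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}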

On Sat, Jun 16, 2012 at 5:31 AM, Ondřej Klimpera <[EMAIL PROTECTED]> wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> It seems Hadoop ignores this property :(
>
> My first thought is that the small file size is the problem and Hadoop
> doesn't bother splitting it properly.
>
> Thanks for any ideas.
>
>
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>>
>> Hi Ondrej
>>
>> You can use NLineInputFormat with n set to 10.
>>
>> ------Original Message------
>> From: Ondřej Klimpera
>> To: [EMAIL PROTECTED]
>> ReplyTo: [EMAIL PROTECTED]
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>>
>> Hello,
>>
>> I have a very small input (on the order of kilobytes), but processing it to
>> produce the output takes several minutes. Is there a way to say: the file has
>> 100 lines, I need 10 mappers, and each mapper node has to process 10 lines of
>> the input file?
>>
>> Thanks for any advice.
>> Ondrej Klimpera
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>
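
[Editor's note: for reference, a minimal sketch of the NLineInputFormat approach Bejoy suggests above, assuming the Hadoop 1.x mapreduce API. The driver and mapper class names, paths, and placeholder map logic are illustrative, not from this thread.]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TenLinesPerMapperDriver {  // hypothetical driver class
  // Hypothetical mapper: stands in for the minutes-per-line processing.
  public static class LineMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws java.io.IOException, InterruptedException {
      context.write(new Text(Long.toString(offset.get())), line); // placeholder work
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "ten-lines-per-mapper");
    job.setJarByClass(TenLinesPerMapperDriver.class);
    job.setMapperClass(LineMapper.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    // NLineInputFormat creates one split per N input lines, so a 100-line file
    // with N = 10 produces 10 map tasks regardless of how few bytes it holds.
    job.setInputFormatClass(NLineInputFormat.class);
    NLineInputFormat.setNumLinesPerSplit(job, 10);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With this, the 100-line / 10-mapper case in the original question comes down to a single setNumLinesPerSplit call, and the split count no longer depends on the file's byte size.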