MapReduce, mail # user - Map‘s number with NLineInputFormat


Re: Map‘s number with NLineInputFormat
yypvsxf19870706 2013-04-20, 01:18
Hi,
   I thought it would be different when using NLineInputFormat.
   So here is my conclusion: the number of maps has nothing to do with NLineInputFormat. NLineInputFormat only decides how many rows go to each map, while the maps themselves are generated according to the split size.

    Am I getting the point?
Regards

Sent from my iPhone
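
For comparison, here is a rough sketch of the arithmetic being discussed. It assumes the job really runs with NLineInputFormat as its input format; in the new API that is selected with job.setInputFormatClass(NLineInputFormat.class), and the snippet quoted below does not show that call, so whether it was set is unknown. The class and method names are made up for illustration and are not Hadoop APIs, and the size-based figure is simplified (it ignores the small slack FileInputFormat allows on the last split).

```java
// Hypothetical helper, not part of Hadoop: it only mirrors the arithmetic
// discussed in this thread.
class NLineSplitSketch {

    // Splits (and so map tasks) expected when each split holds at most
    // linesPerSplit lines: ceil(totalLines / linesPerSplit).
    static long splitsByLineCount(long totalLines, int linesPerSplit) {
        if (totalLines < 0 || linesPerSplit <= 0) {
            throw new IllegalArgumentException(
                "totalLines must be >= 0 and linesPerSplit must be > 0");
        }
        return (totalLines + linesPerSplit - 1) / linesPerSplit;
    }

    // Simplified size-based count for one file: ceil(fileBytes / splitBytes).
    // This is the "block size and raw data" view of how many maps to expect.
    static long splitsBySize(long fileBytes, long splitBytes) {
        if (fileBytes < 0 || splitBytes <= 0) {
            throw new IllegalArgumentException(
                "fileBytes must be >= 0 and splitBytes must be > 0");
        }
        return (fileBytes + splitBytes - 1) / splitBytes;
    }

    public static void main(String[] args) {
        // 1000 rows at 10 rows per split, as in the original question.
        System.out.println(splitsByLineCount(1000, 10)); // prints 100

        // Assumed sizes, for illustration only: a 20 KB file against a
        // 128 MB split size gives a single size-based split.
        System.out.println(splitsBySize(20_000L, 128L * 1024 * 1024)); // prints 1
    }
}
```

If the line-based count applied, 1000 rows at 10 rows per split would give 100 maps. A much smaller count, like the 4 maps in the log, would be more in line with size-based splitting of the input files.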

On 2013-4-20, at 8:39, "姚吉龙" <[EMAIL PROTECTED]> wrote:

> The number of maps is decided by the block size and your raw data.
>
> ―
> Sent from Mailbox for iPhone
>
>
> On Sat, Apr 20, 2013 at 12:30 AM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>
>> Hi All
>>    
>>  I use NLineInputFormat as the text input format with the following code:
>>  NLineInputFormat.setNumLinesPerSplit(job, 10);
>>  NLineInputFormat.addInputPath(job,new Path(args[0].toString()));
>>
>>  My input file contains 1000 rows, so I thought it would distribute 100 (1000/10) maps. However, I got 4 maps.
>>
>>  I'm confused by the number of maps that were distributed, according to the running log [1].
>>  How are maps distributed when using NLineInputFormat?
>>
>>
>> Regards
>>
>>
>>
>> [1]======================================================
>> ....
>> ....
>> 2013-04-19 23:56:20,377 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1286)) - Job job_local_0001 running in uber mode : false
>> 2013-04-19 23:56:20,377 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1293)) -  map 25% reduce 0%
>> 2013-04-19 23:56:20,381 INFO  mapred.MapTask (MapTask.java:sortAndSpill(1597)) - Finished spill 0
>> 2013-04-19 23:56:20,384 INFO  mapred.Task (Task.java:done(979)) - Task:attempt_local_0001_m_000001_0 is done. And is in the process of committing
>> 2013-04-19 23:56:20,388 INFO  mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(501)) - map
>> 2013-04-19 23:56:20,389 INFO  mapred.Task (Task.java:sendDone(1099)) - Task 'attempt_local_0001_m_000001_0' done.
>> 2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(238)) - Finishing task: attempt_local_0001_m_000001_0
>> 2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(213)) - Starting task: attempt_local_0001_m_000002_0
>> 2013-04-19 23:56:20,391 INFO  mapred.Task (Task.java:initialize(565)) -  Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@36bf7916
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:setEquator(1127)) - (EQUATOR) 0 kvi 26214396(104857584)
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(923)) - mapreduce.task.io.sort.mb: 100
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(924)) - soft limit at 83886080
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(925)) - bufstart = 0; bufvoid = 104857600
>> 2013-04-19 23:56:20,487 INFO  mapred.MapTask (MapTask.java:<init>(926)) - kvstart = 26214396; length = 6553600
>> 2013-04-19 23:56:20,515 INFO  mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(501)) -
>> 2013-04-19 23:56:20,515 INFO  mapred.MapTask (MapTask.java:flush(1389)) - Starting flush of map output
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1408)) - Spilling map output
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1409)) - bufstart = 0; bufend = 336; bufvoid = 104857600
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1411)) - kvstart = 26214396(104857584); kvend = 26214208(104856832); length = 189/6553600
>> 2013-04-19 23:56:20,523 INFO  mapred.MapTask (MapTask.java:sortAndSpill(1597)) - Finished spill 0
>> 2013-04-19 23:56:20,552 INFO  mapred.Task (Task.java:done(979)) - Task:attempt_local_0001_m_000002_0 is done. And is in the process of committing
>> 2013-04-19 23:56:20,555 INFO  mapred.LocalJobRunner (LocalJobRunner.java:statusUpdate(501)) - map
>> 2013-04-19 23:56:20,556 INFO  mapred.Task (Task.java:sendDone(1099)) - Task 'attempt_local_0001_m_000002_0' done.
>> 2013-04-19 23:56:20,556 INFO  mapred.LocalJobRunner (LocalJobRunner.java:run(238)) - Finishing task: attempt_local_0001_m_000002_0