MapReduce, mail # user - Map's number with NLineInputFormat


Map's number with NLineInputFormat
YouPeng Yang 2013-04-19, 16:30
Hi All

 I use NLineInputFormat as the text input format, with the following code:
 NLineInputFormat.setNumLinesPerSplit(job, 10);
 NLineInputFormat.addInputPath(job, new Path(args[0].toString()));

 My input file contains 1000 rows, so I expected the job to launch
100 (1000/10) map tasks. However, I got only 4 maps.
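
 The arithmetic behind that expectation, as a minimal sketch: it assumes
NLineInputFormat cuts each input file into splits of N lines, with a shorter
last split when the line count is not a multiple of N. The class and method
names here are hypothetical and are not part of the Hadoop API.

```java
public class NLineSplitEstimate {

    // Expected number of splits (one map task each) for a single file,
    // assuming N lines per split: ceil(totalLines / linesPerSplit).
    static long expectedSplits(long totalLines, int linesPerSplit) {
        if (totalLines < 0 || linesPerSplit <= 0) {
            throw new IllegalArgumentException("totalLines must be >= 0 and linesPerSplit > 0");
        }
        // Integer ceiling division without floating point.
        return (totalLines + linesPerSplit - 1) / linesPerSplit;
    }

    public static void main(String[] args) {
        // 1000 input rows, 10 lines per split, as in the job above.
        System.out.println(expectedSplits(1000, 10)); // prints 100
    }
}
```

 The 4 maps I actually got do not match this estimate for a single
1000-row file with 10 lines per split.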

 I'm confused by the number of map tasks that were launched, according to the
running log [1].
 How does NLineInputFormat decide how many map tasks to create?
Regards

[1]======================================================....
....
2013-04-19 23:56:20,377 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1286)) - Job job_local_0001 running in uber
mode : false
2013-04-19 23:56:20,377 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1293)) -  map 25% reduce 0%
2013-04-19 23:56:20,381 INFO  mapred.MapTask
(MapTask.java:sortAndSpill(1597)) - Finished spill 0
2013-04-19 23:56:20,384 INFO  mapred.Task (Task.java:done(979)) -
Task:attempt_local_0001_m_000001_0 is done. And is in the process of
committing
2013-04-19 23:56:20,388 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) - map
2013-04-19 23:56:20,389 INFO  mapred.Task (Task.java:sendDone(1099)) - Task
'attempt_local_0001_m_000001_0' done.
2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(238)) - Finishing task:
attempt_local_0001_m_000001_0
2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(213)) - Starting task:
attempt_local_0001_m_000002_0
2013-04-19 23:56:20,391 INFO  mapred.Task (Task.java:initialize(565)) -
 Using ResourceCalculatorPlugin :
org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@36bf7916
2013-04-19 23:56:20,486 INFO  mapred.MapTask
(MapTask.java:setEquator(1127)) - (EQUATOR) 0 kvi 26214396(104857584)
2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(923)) -
mapreduce.task.io.sort.mb: 100
2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(924)) -
soft limit at 83886080
2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(925)) -
bufstart = 0; bufvoid = 104857600
2013-04-19 23:56:20,487 INFO  mapred.MapTask (MapTask.java:<init>(926)) -
kvstart = 26214396; length = 6553600
2013-04-19 23:56:20,515 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-19 23:56:20,515 INFO  mapred.MapTask (MapTask.java:flush(1389)) -
Starting flush of map output
2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1408)) -
Spilling map output
2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1409)) -
bufstart = 0; bufend = 336; bufvoid = 104857600
2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1411)) -
kvstart = 26214396(104857584); kvend = 26214208(104856832); length 189/6553600
2013-04-19 23:56:20,523 INFO  mapred.MapTask
(MapTask.java:sortAndSpill(1597)) - Finished spill 0
2013-04-19 23:56:20,552 INFO  mapred.Task (Task.java:done(979)) -
Task:attempt_local_0001_m_000002_0 is done. And is in the process of
committing
2013-04-19 23:56:20,555 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) - map
2013-04-19 23:56:20,556 INFO  mapred.Task (Task.java:sendDone(1099)) - Task
'attempt_local_0001_m_000002_0' done.
2013-04-19 23:56:20,556 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(238)) - Finishing task:
attempt_local_0001_m_000002_0
2013-04-19 23:56:20,556 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(213)) - Starting task:
attempt_local_0001_m_000003_0
2013-04-19 23:56:20,558 INFO  mapred.Task (Task.java:initialize(565)) -
 Using ResourceCalculatorPlugin :
org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@746a63d3
2013-04-19 23:56:20,666 INFO  mapred.MapTask
(MapTask.java:setEquator(1127)) - (EQUATOR) 0 kvi 26214396(104857584)
2013-04-19 23:56:20,666 INFO  mapred.MapTask (MapTask.java:<init>(923)) -
mapreduce.task.io.sort.mb: 100
2013-04-19 23:56:20,666 INFO  mapred.MapTask (MapTask.java:<init>(924)) -
soft limit at 83886080
2013-04-19 23:56:20,666 INFO  mapred.MapTask (MapTask.java:<init>(925)) -
bufstart = 0; bufvoid = 104857600
2013-04-19 23:56:20,667 INFO  mapred.MapTask (MapTask.java:<init>(926)) -
kvstart = 26214396; length = 6553600
2013-04-19 23:56:20,690 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-19 23:56:20,690 INFO  mapred.MapTask (MapTask.java:flush(1389)) -
Starting flush of map output
2013-04-19 23:56:20,690 INFO  mapred.MapTask (MapTask.java:flush(1408)) -
Spilling map output
2013-04-19 23:56:20,690 INFO  mapred.MapTask (MapTask.java:flush(1409)) -
bufstart = 0; bufend = 329; bufvoid = 104857600
2013-04-19 23:56:20,690 INFO  mapred.MapTask (MapTask.java:flush(1411)) -
kvstart = 26214396(104857584); kvend = 26214212(104856848); length 185/6553600
2013-04-19 23:56:20,695 INFO  mapred.MapTask
(MapTask.java:sortAndSpill(1597)) - Finished spill 0
2013-04-19 23:56:20,697 INFO  mapred.Task (Task.java:done(979)) -
Task:attempt_local_0001_m_000003_0 is done. And is in the process of
committing
2013-04-19 23:56:20,717 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) - map
2013-04-19 23:56:20,718 INFO  mapred.Task (Task.java:sendDone(1099)) - Task
'attempt_local_0001_m_000003_0' done.
2013-04-19 23:56:20,718 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(238)) - Finishing task:
attempt_local_0001_m_000003_0
2013-04-19 23:56:20,718 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(394)) - Map task executor complete.
2013-04-19 23:56:20,752 INFO  mapred.Task (Task.java:initialize(565)) -
 Using ResourceCalculatorPlugin :
org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@52cd19d
2013-04-19 23:56:20,760 INFO  mapred.Merger (Merger.java:merge(549)) -
Merging 4 sorted segments
2013-04-19 23:56:20,767 INFO  mapred.Merger (Merger.java:merge(648)) - Down
to the last merge-pass, with 4 segments left of total size: 8532 bytes
2013-04-19 23:56:20,768 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-19 23:56:20,807 WARN  conf.Configurat