HDFS, mail # user - Re: Map's number with NLineInputFormat


Re: Map's number with NLineInputFormat
Harsh J 2013-04-20, 17:04
Are you also setting your desired input format class via the
setInputFormat*(…) API?
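
To make the arithmetic behind the question concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of how the expected split count differs between line-based and size-based splitting. The class and method names are hypothetical, and the size-based formula is only a rough model that ignores the slack Hadoop allows on the last split. The commented-out call shows where `Job.setInputFormatClass` would go in a real driver; without it the job uses the default TextInputFormat, which splits by size and ignores the lines-per-split setting.

```java
// Hypothetical helper, not part of Hadoop: models how many input splits
// (and therefore map tasks) to expect under two splitting strategies.
final class SplitCountSketch {

    // Line-based splitting, as NLineInputFormat is meant to do:
    // every split holds at most linesPerSplit lines.
    static long splitsByLines(long totalLines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be positive");
        }
        return (totalLines + linesPerSplit - 1) / linesPerSplit; // ceiling division
    }

    // Size-based splitting, a rough model of the default behaviour:
    // the file is cut into pieces of about splitBytes each.
    // Assumption: the real implementation tolerates a slightly larger last split.
    static long approxSplitsBySize(long fileBytes, long splitBytes) {
        if (splitBytes <= 0) {
            throw new IllegalArgumentException("splitBytes must be positive");
        }
        return (fileBytes + splitBytes - 1) / splitBytes; // ceiling division
    }

    public static void main(String[] args) {
        // In a real driver the input format has to be selected explicitly:
        //   job.setInputFormatClass(NLineInputFormat.class);
        //   NLineInputFormat.setNumLinesPerSplit(job, 10);

        // 1000 rows at 10 lines per split.
        System.out.println(splitsByLines(1000, 10)); // prints 100

        // A 300 MB file cut at 128 MB, independent of the line count.
        System.out.println(approxSplitsBySize(300L * 1024 * 1024, 128L * 1024 * 1024)); // prints 3
    }
}
```

If the map count tracks the second formula rather than the first, that suggests the job is not actually running with NLineInputFormat.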

On Sat, Apr 20, 2013 at 6:48 AM, yypvsxf19870706
<[EMAIL PROTECTED]> wrote:
> Hi
>    I thought it would be different when adopting NLineInputFormat.
>    So here is my conclusion: the distribution of maps has nothing to do with
> NLineInputFormat. NLineInputFormat decides the number of rows given to each
> map, while the maps themselves are generated according to the split.size.
>
>     Have I got the point?
>
>
> Regards
>
> Sent from my iPhone
>
> On 2013-4-20, at 8:39, "姚吉龙" <[EMAIL PROTECTED]> wrote:
>
> The number of maps is decided by the block size and your raw data.
>
> ―
> Sent from Mailbox for iPhone
>
>
> On Sat, Apr 20, 2013 at 12:30 AM, YouPeng Yang <[EMAIL PROTECTED]>
> wrote:
>>
>> Hi All
>>
>>  I use NLineInputFormat as the text input format with the following
>> code:
>>  NLineInputFormat.setNumLinesPerSplit(job, 10);
>>  NLineInputFormat.addInputPath(job,new Path(args[0].toString()));
>>
>>  My input file contains 1000 rows, so I thought it would distribute
>> 100 (1000/10) maps. However, I got 4 maps.
>>
>>   I'm confused by the number of maps that were distributed according to the
>> running log [1].
>>  How are maps distributed when using NLineInputFormat?
>>
>>
>> Regards
>>
>>
>>
>> [1]======================================================
>> ....
>> ....
>> 2013-04-19 23:56:20,377 INFO  mapreduce.Job
>> (Job.java:monitorAndPrintJob(1286)) - Job job_local_0001 running in uber
>> mode : false
>> 2013-04-19 23:56:20,377 INFO  mapreduce.Job
>> (Job.java:monitorAndPrintJob(1293)) -  map 25% reduce 0%
>> 2013-04-19 23:56:20,381 INFO  mapred.MapTask
>> (MapTask.java:sortAndSpill(1597)) - Finished spill 0
>> 2013-04-19 23:56:20,384 INFO  mapred.Task (Task.java:done(979)) -
>> Task:attempt_local_0001_m_000001_0 is done. And is in the process of
>> committing
>> 2013-04-19 23:56:20,388 INFO  mapred.LocalJobRunner
>> (LocalJobRunner.java:statusUpdate(501)) - map
>> 2013-04-19 23:56:20,389 INFO  mapred.Task (Task.java:sendDone(1099)) -
>> Task 'attempt_local_0001_m_000001_0' done.
>> 2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner
>> (LocalJobRunner.java:run(238)) - Finishing task:
>> attempt_local_0001_m_000001_0
>> 2013-04-19 23:56:20,389 INFO  mapred.LocalJobRunner
>> (LocalJobRunner.java:run(213)) - Starting task:
>> attempt_local_0001_m_000002_0
>> 2013-04-19 23:56:20,391 INFO  mapred.Task (Task.java:initialize(565)) -
>> Using ResourceCalculatorPlugin :
>> org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@36bf7916
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask
>> (MapTask.java:setEquator(1127)) - (EQUATOR) 0 kvi 26214396(104857584)
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(923)) -
>> mapreduce.task.io.sort.mb: 100
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(924)) -
>> soft limit at 83886080
>> 2013-04-19 23:56:20,486 INFO  mapred.MapTask (MapTask.java:<init>(925)) -
>> bufstart = 0; bufvoid = 104857600
>> 2013-04-19 23:56:20,487 INFO  mapred.MapTask (MapTask.java:<init>(926)) -
>> kvstart = 26214396; length = 6553600
>> 2013-04-19 23:56:20,515 INFO  mapred.LocalJobRunner
>> (LocalJobRunner.java:statusUpdate(501)) -
>> 2013-04-19 23:56:20,515 INFO  mapred.MapTask (MapTask.java:flush(1389)) -
>> Starting flush of map output
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1408)) -
>> Spilling map output
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1409)) -
>> bufstart = 0; bufend = 336; bufvoid = 104857600
>> 2013-04-19 23:56:20,516 INFO  mapred.MapTask (MapTask.java:flush(1411)) -
>> kvstart = 26214396(104857584); kvend = 26214208(104856832); length =
>> 189/6553600
>> 2013-04-19 23:56:20,523 INFO  mapred.MapTask
>> (MapTask.java:sortAndSpill(1597)) - Finished spill 0
>> 2013-04-19 23:56:20,552 INFO  mapred.Task (Task.java:done(979)) -
>> Task:attempt_local_0001_m_000002_0 is done. And is in the process of
>> committing
>> 2013-04-19 23:56:20,555 INFO  mapred.LocalJobRunner

Harsh J