Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HDFS >> mail # user >> werid spilled records


Copy link to this message
-
werid spilled records
Hi All.

 Things werid that I found records are spilled into the disk [1] while the
map input split bytes is samll[2].
  Also the value of the mapreduce.task.io.sort.mb  is 100mb by default.

  I do think that there should not be any records spilled to the disks.And
the memory is sufficient  to contain the outputs of maps.
 So how does the spilled could happen?
 Any help will be appretiated.

Thanks.
  The version : CDH4.1.2 with MPv2 & Yarn.

2013-04-01 20:13:50,289 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1265)) - Running job: job_local_0001
2013-04-01 20:13:50,294 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:createOutputCommitter(320)) - OutputCommitter set in
config null
2013-04-01 20:13:50,304 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:createOutputCommitter(338)) - OutputCommitter is
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2013-04-01 20:13:50,322 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(386)) - Waiting for map tasks
2013-04-01 20:13:50,324 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(213)) - Starting task:
attempt_local_0001_m_000000_0
2013-04-01 20:13:50,375 INFO  mapred.Task (Task.java:initialize(565)) -
 Using ResourceCalculatorPlugin :
org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@4d480ea
2013-04-01 20:13:50,784 INFO  mapred.MapTask
(MapTask.java:setEquator(1127)) - (EQUATOR) 0 kvi 26214396(104857584)
2013-04-01 20:13:50,785 INFO  mapred.MapTask (MapTask.java:<init>(923)) -
mapreduce.task.io.sort.mb: 100
2013-04-01 20:13:50,785 INFO  mapred.MapTask (MapTask.java:<init>(924)) -
soft limit at 83886080
2013-04-01 20:13:50,785 INFO  mapred.MapTask (MapTask.java:<init>(925)) -
bufstart = 0; bufvoid = 104857600
2013-04-01 20:13:50,785 INFO  mapred.MapTask (MapTask.java:<init>(926)) -
kvstart = 26214396; length = 6553600
2013-04-01 20:13:50,952 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-01 20:13:50,955 INFO  mapred.MapTask (MapTask.java:flush(1389)) -
Starting flush of map output
2013-04-01 20:13:50,955 INFO  mapred.MapTask (MapTask.java:flush(1408)) -
Spilling map output
2013-04-01 20:13:50,955 INFO  mapred.MapTask (MapTask.java:flush(1409)) -
bufstart = 0; bufend = 4449; bufvoid = 104857600
2013-04-01 20:13:50,955 INFO  mapred.MapTask (MapTask.java:flush(1411)) -
kvstart = 26214396(104857584); kvend = 26214276(104857104); length 121/6553600
2013-04-01 20:13:50,988 INFO  mapred.MapTask
(MapTask.java:sortAndSpill(1597)) - Finished spill 0
2013-04-01 20:13:50,992 INFO  mapred.Task (Task.java:done(979)) -
Task:attempt_local_0001_m_000000_0 is done. And is in the process of
committing
2013-04-01 20:13:51,016 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) - map
2013-04-01 20:13:51,016 INFO  mapred.Task (Task.java:sendDone(1099)) - Task
'attempt_local_0001_m_000000_0' done.
2013-04-01 20:13:51,017 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(238)) - Finishing task:
attempt_local_0001_m_000000_0
2013-04-01 20:13:51,017 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:run(394)) - Map task executor complete.
2013-04-01 20:13:51,032 INFO  mapred.Task (Task.java:initialize(565)) -
 Using ResourceCalculatorPlugin :
org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@40363068
2013-04-01 20:13:51,045 INFO  mapred.Merger (Merger.java:merge(549)) -
Merging 1 sorted segments
2013-04-01 20:13:51,085 INFO  mapred.Merger (Merger.java:merge(648)) - Down
to the last merge-pass, with 1 segments left of total size: 4531 bytes
2013-04-01 20:13:51,085 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-01 20:13:51,109 WARN  conf.Configuration
(Configuration.java:warnOnceIfDeprecated(808)) - mapred.skip.on is
deprecated. Instead, use mapreduce.job.skiprecords
2013-04-01 20:13:51,265 INFO  mapred.Task (Task.java:done(979)) -
Task:attempt_local_0001_r_000000_0 is done. And is in the process of
committing
2013-04-01 20:13:51,268 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) -
2013-04-01 20:13:51,268 INFO  mapred.Task (Task.java:commit(1140)) - Task
attempt_local_0001_r_000000_0 is allowed to commit now
2013-04-01 20:13:51,276 INFO  output.FileOutputCommitter
(FileOutputCommitter.java:commitTask(432)) - Saved output of task
'attempt_local_0001_r_000000_0' to
hdfs://Hadoop01:8040/user/hadoop/d/d2/_temporary/0/task_local_0001_r_000000
2013-04-01 20:13:51,277 INFO  mapred.LocalJobRunner
(LocalJobRunner.java:statusUpdate(501)) - reduce > reduce
2013-04-01 20:13:51,277 INFO  mapred.Task (Task.java:sendDone(1099)) - Task
'attempt_local_0001_r_000000_0' done.
2013-04-01 20:13:51,293 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1286)) - Job job_local_0001 running in uber
mode : false
2013-04-01 20:13:51,294 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1293)) -  map 100% reduce 100%
2013-04-01 20:13:52,301 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1304)) - Job job_local_0001 completed
successfully
2013-04-01 20:13:52,317 INFO  mapreduce.Job
(Job.java:monitorAndPrintJob(1311)) - Counters: 32
File System Counters
FILE: Number of bytes read=56420
FILE: Number of bytes written=379066
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=8728
HDFS: Number of bytes written=4426
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Map-Reduce Framework
Map input records=31
Map output records=31
Map output bytes=4449
Map output materialized bytes=4540
Input split bytes=131 [1]
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=0
Reduce input records=31
Reduce output records=31
Spilled Records=62   [2]
Shuffled Maps =0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=366346240
File Input Format Counters
Bytes Read=4364