can i specify no shuffle and/or no sort in the reducer and no disk space left IOException when there is DFS space remaining
I have a Mapper and a Reducer as part of a job. All my data transformation
occurs in the mapper, and nothing at all needs to be done in the reducer.
When I set the reducer on the Job, I simply use Reducer.class.

I notice that after the map tasks reach 100%, it takes a very long time
before reducing starts. Once reducing starts, I get an
FSError: java.io.IOException: No space left on device. I checked the DFS
health (via the web page), and I still have 42.41% DFS remaining. Why does
this occur? Eventually 4 attempts are made to run the reducer, but they all
end with the same IOException. Sample output is at the bottom. Notice that
the reduce percentage goes up and then drops back to 0% before the
IOException.
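
One thing that may be worth ruling out, as an assumption and not a confirmed diagnosis: the DFS health page reports HDFS capacity, while "No space left on device" is raised by the operating system for a local file system. Reduce tasks stage the map output they fetch on the task node's local disks (the directories listed in mapred.local.dir), so a local partition can fill up while DFS still has room. A minimal sketch for checking usable space on local directories follows; the class name LocalDiskCheck and the default directory are hypothetical, and real local directories should be passed as arguments.

```java
import java.io.File;

public class LocalDiskCheck {
    // Percentage of the partition holding `dir` that is still usable,
    // or -1 if the path does not name a partition (e.g. it does not exist).
    static double usablePercent(File dir) {
        long total = dir.getTotalSpace();
        if (total <= 0) {
            return -1.0;
        }
        return 100.0 * dir.getUsableSpace() / total;
    }

    public static void main(String[] args) {
        // Hypothetical default: the JVM temp directory. Pass the node's
        // actual task-local directories on the command line instead.
        String[] dirs = args.length > 0
                ? args
                : new String[] { System.getProperty("java.io.tmpdir") };
        for (String d : dirs) {
            File f = new File(d);
            System.out.printf("%s: %.2f%% usable (%d of %d bytes)%n",
                    d, usablePercent(f), f.getUsableSpace(), f.getTotalSpace());
        }
    }
}
```

Running this on each task node against its local directories would show whether a local partition, and not DFS, is the one running out of space.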

Also, I want to know whether I can subclass Reducer, or otherwise do
something about shuffling and sorting, since these steps are not important
to me. I just want each record emitted from the Mapper to go straight to
disk. Is it possible to do this without going through a Reducer? I suspect
this is part of the reason it takes so long between 100% map and the first
sign of reduce.
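
For reference, the usual way to skip the reduce phase in Hadoop MapReduce is to set the number of reduce tasks to zero. With zero reducers there is no shuffle or sort, and each map task writes its records directly to the job's output path (as part-m-NNNNN files). The sketch below illustrates this under stated assumptions: the class name MapOnlyJob is hypothetical, the base Mapper class (an identity mapper) stands in for the real mapper, input and output paths come from the command line, and Job.getInstance is assumed to be available (older releases use new Job(conf, name) instead).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // On older Hadoop releases: Job job = new Job(conf, "map-only example");
        Job job = Job.getInstance(conf, "map-only example");
        job.setJarByClass(MapOnlyJob.class);

        // Stand-in mapper: the base Mapper passes records through unchanged.
        // Replace with the mapper that performs the real transformation.
        job.setMapperClass(Mapper.class);

        // Zero reduce tasks: no shuffle, no sort, no reducer is run.
        job.setNumReduceTasks(0);

        // Types emitted by the identity mapper over the default text input.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that no reducer class is set at all in this configuration; setting Reducer.class while keeping one or more reduce tasks still runs the full shuffle and sort.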

EXAMPLE OUTPUT

12/03/07 22:38:45 INFO mapred.JobClient:  map 98% reduce 0%
12/03/07 22:39:18 INFO mapred.JobClient:  map 99% reduce 0%
12/03/07 22:39:43 INFO mapred.JobClient:  map 100% reduce 0%
12/03/07 22:58:14 INFO mapred.JobClient:  map 100% reduce 1%
12/03/07 22:58:23 INFO mapred.JobClient:  map 100% reduce 3%
12/03/07 22:58:38 INFO mapred.JobClient:  map 100% reduce 6%
12/03/07 22:58:57 INFO mapred.JobClient:  map 100% reduce 7%
12/03/07 22:59:21 INFO mapred.JobClient:  map 100% reduce 9%
12/03/07 23:00:00 INFO mapred.JobClient:  map 100% reduce 10%
12/03/07 23:00:09 INFO mapred.JobClient:  map 100% reduce 12%
12/03/07 23:00:58 INFO mapred.JobClient:  map 100% reduce 0%
12/03/07 23:01:00 INFO mapred.JobClient: Task Id : attempt_201203071517_0043_r_000000_0, Status : FAILED
FSError: java.io.IOException: No space left on deviceFSError:
java.io.IOException: No space left on deviceFSError: java.io.IOException:
No space left on deviceFSError: java.io.IOException: No space left on
deviceFSError: java.io.IOException: No space left on deviceFSError:
java.io.IOException: No space left on device
attempt_201203071517_0043_r_000000_0: log4j:ERROR Failed to flush writer,
attempt_201203071517_0043_r_000000_0: java.io.IOException: No space left on device
12/03/07 23:01:31 INFO mapred.JobClient:  map 100% reduce 1%
12/03/07 23:01:34 INFO mapred.JobClient:  map 100% reduce 3%
12/03/07 23:01:37 INFO mapred.JobClient:  map 100% reduce 4%
12/03/07 23:01:49 INFO mapred.JobClient:  map 100% reduce 6%
12/03/07 23:01:55 INFO mapred.JobClient:  map 100% reduce 7%
12/03/07 23:02:19 INFO mapred.JobClient:  map 100% reduce 9%
12/03/07 23:02:52 INFO mapred.JobClient:  map 100% reduce 0%
12/03/07 23:02:54 INFO mapred.JobClient: Task Id : attempt_201203071517_0043_r_000000_1, Status : FAILED
FSError: java.io.IOException: No space left on deviceFSError:
java.io.IOException: No space left on deviceFSError: java.io.IOException:
No space left on device
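
To put a number on the delay visible in the output above, the gap between the "map 100% reduce 0%" line and the first reduce progress line can be computed from the timestamps. A small sketch, assuming the log4j timestamp format yy/MM/dd HH:mm:ss seen in the output; the class name ShuffleGap is hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ShuffleGap {
    // Timestamp layout at the start of each JobClient line, e.g. "12/03/07 22:39:43".
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");

    // Parses the leading 17-character timestamp of a log line.
    static LocalDateTime stamp(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 17), STAMP);
    }

    // Elapsed time between two log lines.
    static Duration between(String earlierLine, String laterLine) {
        return Duration.between(stamp(earlierLine), stamp(laterLine));
    }

    public static void main(String[] args) {
        String mapDone = "12/03/07 22:39:43 INFO mapred.JobClient:  map 100% reduce 0%";
        String firstReduce = "12/03/07 22:58:14 INFO mapred.JobClient:  map 100% reduce 1%";
        Duration gap = between(mapDone, firstReduce);
        System.out.println(gap.getSeconds() + " seconds"); // prints "1111 seconds"
    }
}
```

That is 18 minutes 31 seconds between the maps finishing and the first reported reduce progress.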