Hadoop >> mail # user >> Where does Hadoop store its maps?


Where does Hadoop store its maps?
Hi,

I am running a Hadoop cluster that I built myself on EC2, and my map tasks
are running out of disk space. If I knew which directories Hadoop uses for
map spills, I could point them at the large ephemeral drive on the EC2
machines. Otherwise, I would have to keep enlarging the root volume, and
that's not very smart.

Thank you. The error I get is below.

Sincerely,
Mark

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/file.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:376)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:69)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1495)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs
java.io.IOException: Spill failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:886)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.freeeed.main.ZipFileProcessor.emitAsMap(ZipFileProcessor.java:279)
at org.freeeed.main.ZipFileProcessor.processWithTrueZip(ZipFileProcessor.java:107)
at org.freeeed.main.ZipFileProcessor.process(ZipFileProcessor.java:55)
at org.freeeed.main.Map.map(Map.java:70)
at org.freeeed.main.Map.map(Map.java:24)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(User
org.apache.hadoop.io.SecureIOUtils$AlreadyExistsException: EEXIST: File exists
at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:178)
at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
at org.apache.hadoop.mapred.Child$4.run(Child.java:272)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: EEXIST: File exists
at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)
at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:172)
... 7 more