

Many Errors at the last step of copying files from _temporary to Output Directory
Hi

My environment is as follows:

INPUT FILES
===========
400 GZIP files, one from each server; average size gzipped: 25 MB

REDUCER
=======
Uses MultipleOutputs

OUTPUT (Snappy)
===============
/path/to/output/dir1
/path/to/output/dir2
/path/to/output/dir3
/path/to/output/dir4

Number of output directories = 1600
Number of output files = 17000
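For context, the reducer's use of MultipleOutputs to fan records out across many directories looks roughly like this. This is a minimal sketch with hypothetical class, key/value, and path names; the actual job uses a custom NextagFileOutputFormat, as the stack trace below shows.

```java
// Sketch of a reducer writing to multiple output directories via
// MultipleOutputs (hypothetical names, not taken from the real job).
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class ImpressionLogReducer
        extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // The baseOutputPath argument ("dir1/part" here) is what
            // spreads output files across many subdirectories of the
            // job's output directory.
            mos.write(NullWritable.get(), value, "dir1/part");
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // Closing MultipleOutputs flushes every per-directory writer;
        // the files are then moved from _temporary to the final output
        // directory when the OutputCommitter commits the task.
        mos.close();
    }
}
```

With 1600 directories and 17000 files, that commit/rename phase touches a large number of open streams at once, which is the step where the errors below appear.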

SETTINGS
========
Maximum number of transfer threads:
dfs.datanode.max.xcievers, dfs.datanode.max.transfer.threads = 16384
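For reference, these settings would typically live in hdfs-site.xml on each DataNode. dfs.datanode.max.xcievers is the older (misspelled) name that dfs.datanode.max.transfer.threads replaced; setting both covers mixed-version daemons and tools:

```xml
<!-- hdfs-site.xml: cap on concurrent block transfer threads per DataNode -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>16384</value>
</property>
<!-- Deprecated alias, still read by older releases -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>16384</value>
</property>
```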

ERRORS
======
I am consistently getting errors at the last step, when files are copied from _temporary to the output directory.

ERROR 1
=======
BADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:145)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:205)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
ERROR 2
=======
2013-06-13 23:35:15,902 WARN [main] org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.28.21.171:50010 for block, add to deadNodes and continue. java.io.IOException: Got error for OP_READ_BLOCK, self=/10.28.21.171:57436, remote=/10.28.21.171:50010, for file /user/nextag/oozie-workflows/config/aggregations.conf, for pool BP-64441488-10.28.21.167-1364511907893 block 213045727251858949_8466884
java.io.IOException: Got error for OP_READ_BLOCK, self=/10.28.21.171:57436, remote=/10.28.21.171:50010, for file /user/nextag/oozie-workflows/config/aggregations.conf, for pool BP-64441488-10.28.21.167-1364511907893 block 213045727251858949_8466884
at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:444)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:409)
at org.apache.hadoop.hdfs.BlockReaderFactory.newBlockReader(BlockReaderFactory.java:105)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:937)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:455)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:645)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:689)
at java.io.DataInputStream.read(DataInputStream.java:132)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at com.wizecommerce.utils.mapred.HdfsUtils.readFileIntoList(HdfsUtils.java:83)
at com.wizecommerce.utils.mapred.HdfsUtils.getConfigParamMap(HdfsUtils.java:214)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputPath(NextagFileOutputFormat.java:171)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getOutputCommitter(NextagFileOutputFormat.java:330)
at com.wizecommerce.utils.mapred.NextagFileOutputFormat.getDefaultWorkFile(NextagFileOutputFormat.java:306)
at com.wizecommerce.utils.mapred.NextagTextOutputFormat.getRecordWriter(NextagTextOutputFormat.java:111)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:413)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:395)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.writePtitleExplanationBlob(OutpdirImpressionLogReducer.java:337)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.processPTitle(OutpdirImpressionLogReducer.java:171)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:91)
at com.wizecommerce.parser.mapred.OutpdirImpressionLogReducer.reduce(OutpdirImpressionLogReducer.java:24)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:636)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:396)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Thanks
Sanjay