Hadoop >> mail # dev >> Checksum Error during Reduce Phase hadoop-1.0.2


Re: Checksum Error during Reduce Phase hadoop-1.0.2
Hi Pavan,

Do you see this happen on a specific node every time (i.e. when the
reducer runs there)?
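
One rough way to check this is to tally the failed reduce attempts by the host they ran on. The sketch below assumes you have already collected (attempt ID, hostname) pairs by hand, for example from the task attempt pages in the JobTracker web UI; the function name failures_by_host, the hostnames, and all attempt IDs other than the one in your log are hypothetical.

```python
from collections import Counter


def failures_by_host(failed_attempts):
    """Count failed task attempts per host, most-affected host first.

    failed_attempts: iterable of (attempt_id, hostname) pairs.
    """
    counts = Counter(host for _attempt_id, host in failed_attempts)
    return counts.most_common()


# Hypothetical sample data: only the first attempt ID appears in the
# original report; the hostnames and remaining IDs are made up.
failed = [
    ("attempt_201208101018_0001_r_000027_0", "node3"),
    ("attempt_201208101018_0001_r_000027_1", "node5"),
    ("attempt_201208101018_0001_r_000031_0", "node3"),
    ("attempt_201208101018_0001_r_000012_0", "node3"),
]

for host, count in failures_by_host(failed):
    print(host, count)
```

If one host accounts for most of the failures, that node's disk, memory, or network hardware would be the first thing to look at.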

On Fri, Aug 10, 2012 at 11:43 PM, Pavan Kulkarni
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am running a Terasort on a cluster of 8 nodes. The map phase completes,
> but when the reduce phase is around 68-70% I get the following error.
>
> 12/08/10 11:02:36 INFO mapred.JobClient: Task Id : attempt_201208101018_0001_r_000027_0, Status : FAILED
> java.lang.RuntimeException: problem advancing post rec#38320220
>         at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1214)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:249)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:245)
>         at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:40)
>         at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.fs.ChecksumException: Checksum Error
>         at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:164)
>         at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101)
>         at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:328)
>         at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:358)
>         at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
>         at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:374)
>         at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$RawKVIteratorReader.next(ReduceTask.java:2531)
>         at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>         at org.apache.hadoop.mapred.Task$ValuesIterator.readNextKey(Task.java:1253)
>         at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1212)
>         ... 10 more
>
> I came across someone facing the same issue
> <http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201001.mbox/%[EMAIL PROTECTED]%3E>
> in the mail archives, and he seemed to resolve it by listing the hostnames in
> the /etc/hosts file. All my nodes already have correct hostname entries in
> /etc/hosts, but the reducers still throw this error.
> Any help with this issue is appreciated. Thanks.
>
> --
>
> --With Regards
> Pavan Kulkarni

--
Harsh J