Hadoop, mail # user - how to figure out the range of a split that failed?


Re: how to figure out the range of a split that failed?
edward choi 2010-06-30, 05:20
Thanks for the quick response.
I know about the SkipBadRecords feature, but unfortunately I cannot use it
because I am running my job with Hadoop Streaming.
I asked earlier whether there is any way to use SkipBadRecords with Hadoop
Streaming but never got an answer, so I guess it is not possible.
Thanks for your concern.
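
One possible workaround for a Streaming job, sketched here rather than taken from this thread, is to make the mapper script itself tolerate bad records and report which split it was reading. Streaming exports job configuration values to the child process as environment variables with the dots replaced by underscores, so on a 0.20-era cluster the split should be visible as map_input_file, map_input_start and map_input_length (newer releases use different property names, so treat these as assumptions to verify). The process_record function and its tab-separated record format are hypothetical placeholders for the real mapper logic.

```python
import os
import sys


def split_description(env=os.environ):
    """Describe the current input split from variables Streaming exports.

    Assumption: the 0.20-era names map.input.file/start/length, exported
    with dots turned into underscores. Missing values are reported as unknown.
    """
    path = env.get("map_input_file", "<unknown file>")
    start = env.get("map_input_start", "?")
    length = env.get("map_input_length", "?")
    return "file=%s start=%s length=%s" % (path, start, length)


def process_record(line):
    """Hypothetical per-record logic; replace with the real mapper body."""
    key, value = line.rstrip("\n").split("\t", 1)
    return "%s\t%d" % (key, len(value))


def run_mapper(lines, out=sys.stdout, err=sys.stderr, env=os.environ):
    """Map every record; log failing records to stderr instead of crashing."""
    bad = 0
    for number, line in enumerate(lines, 1):
        try:
            out.write(process_record(line) + "\n")
        except Exception as exc:
            bad += 1
            # stderr ends up in the task's stderr log on the TaskTracker.
            err.write("bad record #%d in split [%s]: %r (%s)\n"
                      % (number, split_description(env), line[:200], exc))
    return bad

# In the actual mapper script, finish with: run_mapper(sys.stdin)
```

With something like this in place, a malformed record no longer kills the script (a script that dies mid-task is what produces the "Broken pipe" on the Java side), and the task's stderr log names the file and byte range that contained the bad input.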

2010/6/30 Hemanth Yamijala <[EMAIL PROTECTED]>

> Hi,
>
> > I am running a mapreduce job on my hadoop cluster.
> >
> > I am running a job over 10 gigabytes of data and one tiny failed task
> > crashes the whole operation.
> > I am up to 98% complete, and throwing away all the finished data seems
> > like an awful waste.
> > I'd like to save the finished data and rerun only the failed tasks (the
> > remaining 2%).
> >
> > Is there any way to figure out the range of the splits that failed?
> > I went to "localhost:50030" to look for useful information, but I
> > must be looking in the wrong places.
>
> Can you check the 'Skipping Bad Records' feature described here and see
> if that helps?
> http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html#Skipping+Bad+Records
>
> Thanks
> Hemanth
>
> >
> > Could somebody help me with this problem?
> >
> >
> > Below is the log of a failed task. Any information I can use?
> >
> > *syslog logs*
> >
> > Records R/W=41707/41639
> > 2010-06-30 07:35:30,530 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=41776/41726
> > 2010-06-30 07:35:40,554 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=41865/41804
> > 2010-06-30 07:35:50,559 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=41970/41932
> > 2010-06-30 07:36:00,637 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42073/42065
> > 2010-06-30 07:36:10,772 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42258/42196
> > 2010-06-30 07:36:20,785 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42318/42274
> > 2010-06-30 07:36:30,985 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42378/42351
> > 2010-06-30 07:36:41,005 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42442/42419
> > 2010-06-30 07:36:51,149 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42499/42484
> > 2010-06-30 07:37:01,235 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42559/42547
> > 2010-06-30 07:37:11,242 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42626/42611
> > 2010-06-30 07:37:21,485 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42769/42704
> > 2010-06-30 07:37:31,617 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42845/42782
> > 2010-06-30 07:37:41,725 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42915/42875
> > 2010-06-30 07:37:51,733 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=42986/42949
> > 2010-06-30 07:38:01,795 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=43070/43051
> > 2010-06-30 07:38:11,849 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=43138/43136
> > 2010-06-30 07:38:22,398 INFO org.apache.hadoop.streaming.PipeMapRed:
> > Records R/W=43258/43200
> > 2010-06-30 07:38:31,642 INFO org.apache.hadoop.streaming.PipeMapRed:
> > MRErrorThread done
> > 2010-06-30 07:38:31,643 INFO org.apache.hadoop.streaming.PipeMapRed:
> > MROutputThread done
> > 2010-06-30 07:38:31,765 INFO org.apache.hadoop.streaming.PipeMapRed:
> > log:null
> > R/W/S=43335/43271/0 in:7=43335/5885 [rec/s] out:7=43271/5885 [rec/s]
> > minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
> > HOST=null
> > USER=hadoop
> > HADOOP_USER=null
> > last Hadoop input: |null|
> > last tool output: |[B@d22860|
> > Date: Wed Jun 30 07:38:31 KST 2010
> > java.io.IOException: Broken pipe
> >        at java.io.FileOutputStream.writeBytes(Native Method)
> >        at java.io.FileOutputStream.write(FileOutputStream.java:260)
> >        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> >        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)