MapReduce, mail # user - OutputValueGroupingComparator gets strange inputs (topic changed from "Logs cannot be created")


Re: OutputValueGroupingComparator gets strange inputs (topic changed from "Logs cannot be created")
Bertrand Dechoux 2012-08-09, 14:57
I am just curious, but are you using Writable? If so, there is a
WritableComparator...
If you are going to interpret every byte (you create a String, so you do),
there is no clear reason for choosing such a low-level API.
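
To illustrate why the raw API is easy to get wrong: Hadoop serializes a Text as a variable-length integer byte count followed by the UTF-8 bytes, so the (start, length) range handed to a RawComparator includes that prefix. The sketch below is only a stand-alone simulation and does not use Hadoop. The class RawTextDemo and its methods are hypothetical names, and it assumes payloads shorter than 128 bytes so that the prefix is exactly one byte. Whether this is what actually causes the exception in the trace below is not confirmed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hadoop or of the original job.
class RawTextDemo {

    // Simulates a length-prefixed record. Assumption: the payload is
    // shorter than 128 bytes, so the length prefix fits in one byte.
    static byte[] serialize(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 127) {
            throw new IllegalArgumentException("sketch only handles short payloads");
        }
        byte[] out = new byte[utf8.length + 1];
        out[0] = (byte) utf8.length;                     // length prefix
        System.arraycopy(utf8, 0, out, 1, utf8.length);  // payload
        return out;
    }

    // Decodes the whole range, prefix included. The result starts with
    // a control character instead of the expected text.
    static String decodeNaive(byte[] buf, int start, int length) {
        return new String(buf, start, length, StandardCharsets.UTF_8);
    }

    // Skips the one-byte prefix before decoding the payload.
    static String decodeSkippingPrefix(byte[] buf, int start, int length) {
        return new String(buf, start + 1, length - 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] record = serialize("42");
        System.out.println(Integer.parseInt(decodeSkippingPrefix(record, 0, record.length)));
        try {
            Integer.parseInt(decodeNaive(record, 0, record.length));
        } catch (NumberFormatException e) {
            System.out.println("naive decoding failed: " + e.getMessage());
        }
    }
}
```

In a real job, a comparator that extends WritableComparator and is constructed with createInstances set to true receives deserialized keys in compare(WritableComparable, WritableComparable), so none of this offset arithmetic is needed.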

Regards

Bertrand

On Thu, Aug 9, 2012 at 4:47 PM, Björn-Elmar Macek <[EMAIL PROTECTED]> wrote:

> Hi again,
>
> this is a direct response to my previous posting with the title "Logs
> cannot be created", where logs could not be created (Spill failed). I got
> the hint that I should check privileges, but that was not the problem,
> because I own the folders that were used for this.
>
> I finally found an important hint in a log saying:
> 12/08/09 15:30:49 WARN mapred.JobClient: Error reading task outputhttp://its-cs229.its.uni-kassel.de:50060/tasklog?plaintext=true&attemptid=attempt_201208091516_0001_m_000048_0&filter=stdout
> 12/08/09 15:30:49 WARN mapred.JobClient: Error reading task outputhttp://its-cs229.its.uni-kassel.de:50060/tasklog?plaintext=true&attemptid=attempt_201208091516_0001_m_000048_0&filter=stderr
> 12/08/09 15:34:34 INFO mapred.JobClient: Task Id : attempt_201208091516_0001_m_000055_0, Status : FAILED
> java.io.IOException: Spill failed
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
>         at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:592)
>         at uni.kassel.macek.rtprep.RetweetMapper.map(RetweetMapper.java:26)
>         at uni.kassel.macek.rtprep.RetweetMapper.map(RetweetMapper.java:12)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.NumberFormatException: For input string: ""
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>         at java.lang.Integer.parseInt(Integer.java:468)
>         at java.lang.Integer.parseInt(Integer.java:497)
>         at uni.kassel.macek.rtprep.Tweet.getRT(Tweet.java:126)
>         at uni.kassel.macek.rtprep.TwitterValueGroupingComparator.compare(TwitterValueGroupingComparator.java:47)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1111)
>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:95)
>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1399)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
>
>
>
> corresponding to the following lines of code within the class
> TwitterValueGroupingComparator:
>
> public class TwitterValueGroupingComparator implements RawComparator<Text>
> {
> ...
>     public int compare(byte[] text1, int start1, int length1, byte[] text2,
>         int start2, int length2) {
>
>     byte[] tweet1 = new byte[length1];// length1-1 (???)
>     byte[] tweet2 = new byte[length2];// length1-1 (???)
>
>     System.arraycopy(text1, start1, tweet1, 0, length1);// start1+1 (???)