MapReduce >> mail # user >> OutputValueGroupingComparator gets strange inputs (topic changed from "Logs cannot be created")


Re: OutputValueGroupingComparator gets strange inputs (topic changed from "Logs cannot be created")
I would recommend looking at the Yahoo! Hadoop tutorial for more information.

Here is the part we are discussing:
http://developer.yahoo.com/hadoop/tutorial/module5.html#writable-comparator
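
To make the point from the earlier message concrete, here is a minimal plain-Java sketch. It deliberately uses no Hadoop classes, so it is only an analogy: the record layout (one length byte followed by UTF-8 bytes) is made up for illustration and is not the encoding used by Text, and the names serialize, deserialize, compareRaw and compareObjects are hypothetical. It shows that a raw-style comparator receives a shared buffer plus start and length, and that once the bytes are decoded into a String anyway, the actual comparison is just an object-level compare.

```java
import java.nio.charset.StandardCharsets;

final class RawVersusObjectCompare {

    // Hypothetical layout for this sketch only: 1 length byte + UTF-8 payload.
    static byte[] serialize(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("value too long for this sketch");
        }
        byte[] out = new byte[utf8.length + 1];
        out[0] = (byte) utf8.length;
        System.arraycopy(utf8, 0, out, 1, utf8.length);
        return out;
    }

    // Decodes exactly one record, honoring the start offset and record length.
    static String deserialize(byte[] buf, int start, int length) {
        int payload = buf[start];
        if (payload != length - 1) {
            throw new IllegalArgumentException("length prefix does not match record length");
        }
        return new String(buf, start + 1, payload, StandardCharsets.UTF_8);
    }

    // Object-level comparison: this is all the logic once values are decoded.
    static int compareObjects(String a, String b) {
        return a.compareTo(b);
    }

    // Raw-style comparison: must slice the shared buffer before comparing.
    static int compareRaw(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return compareObjects(deserialize(b1, s1, l1), deserialize(b2, s2, l2));
    }

    public static void main(String[] args) {
        byte[] first = serialize("beta");
        byte[] second = serialize("alpha");

        // Both records sit back to back in one buffer, as in a sort buffer.
        byte[] buf = new byte[first.length + second.length];
        System.arraycopy(first, 0, buf, 0, first.length);
        System.arraycopy(second, 0, buf, first.length, second.length);

        int correct = compareRaw(buf, 0, first.length, buf, first.length, second.length);
        System.out.println("sliced compare > 0: " + (correct > 0));

        // Ignoring start/length decodes the whole buffer for both sides,
        // so both strings are identical and the result is 0.
        String naive1 = new String(buf, StandardCharsets.UTF_8);
        String naive2 = new String(buf, StandardCharsets.UTF_8);
        System.out.println("naive compare: " + naive1.compareTo(naive2));
    }
}
```

With a comparator that works on deserialized objects, only the equivalent of compareObjects has to be written; the offset and length handling is done for you.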

Regards

Bertrand

On Thu, Aug 9, 2012 at 5:03 PM, Björn-Elmar Macek <[EMAIL PROTECTED]> wrote:

> Hi Bertrand,
>
> I am using RawComparator because it was used in a tutorial by a well-known
> Hadoop author describing how to sort the input for the reducer. Is there an
> easier alternative?
>
>
> On 09.08.2012 at 16:57, Bertrand Dechoux wrote:
>
> I am just curious, but are you using Writable? If so, there is a
> WritableComparator...
> If you are going to interpret every byte (you create a String, so you
> do), there is no clear reason for choosing such a low-level API.
>
> Regards
>
> Bertrand
>
> On Thu, Aug 9, 2012 at 4:47 PM, Björn-Elmar Macek <[EMAIL PROTECTED]> wrote:
>
>> Hi again,
>>
>> this is a direct response to my previous posting titled "Logs cannot be
>> created", where the logs could not be created (Spill failed). I was
>> advised to check the privileges, but that was not the problem, because I
>> own the folders that were used for this.
>>
>> I finally found an important hint in a log saying:
>> 12/08/09 15:30:49 WARN mapred.JobClient: Error reading task outputhttp://its-cs229.its.uni-kassel.de:50060/tasklog?plaintext=true&attemptid=attempt_201208091516_0001_m_000048_0&filter=stdout
>> 12/08/09 15:30:49 WARN mapred.JobClient: Error reading task outputhttp://its-cs229.its.uni-kassel.de:50060/tasklog?plaintext=true&attemptid=attempt_201208091516_0001_m_000048_0&filter=stderr
>> 12/08/09 15:34:34 INFO mapred.JobClient: Task Id : attempt_201208091516_0001_m_000055_0, Status : FAILED
>> java.io.IOException: Spill failed
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
>>         at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:592)
>>         at uni.kassel.macek.rtprep.RetweetMapper.map(RetweetMapper.java:26)
>>         at uni.kassel.macek.rtprep.RetweetMapper.map(RetweetMapper.java:12)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> Caused by: java.lang.NumberFormatException: For input string: ""
>>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>>         at java.lang.Integer.parseInt(Integer.java:468)
>>         at java.lang.Integer.parseInt(Integer.java:497)
>>         at uni.kassel.macek.rtprep.Tweet.getRT(Tweet.java:126)
>>         at uni.kassel.macek.rtprep.TwitterValueGroupingComparator.compare(TwitterValueGroupingComparator.java:47)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1111)
>>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:95)
>>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1399)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
>>
>>
>>
>> corresponding to the following lines of code within the class
>> TwitterValueGroupingComparator:
>>
>> public class TwitterValueGroupingComparator implements
>> RawComparator<Text> {
>> ...
>>     public int compare(byte[] text1, int start1, int length1, byte[]
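
The "Caused by" entry in the trace shows Integer.parseInt being handed an empty string from inside the comparator, and an exception thrown there during the sort is what surfaces as "Spill failed". Whatever the reason the string is empty, the parse can be guarded so the comparator does not throw. A minimal sketch, assuming a hypothetical helper parseIntOrEmpty that is not part of the original Tweet class:

```java
import java.util.OptionalInt;

final class SafeIntParse {

    // Hypothetical helper for illustration: returns an empty OptionalInt
    // instead of throwing when the field is null, blank, or not a number.
    static OptionalInt parseIntOrEmpty(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(trimmed));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseIntOrEmpty("42")); // OptionalInt[42]
        System.out.println(parseIntOrEmpty(""));   // OptionalInt.empty
    }
}
```

This only hides the symptom. If the comparator is receiving input that should never be empty, the byte range it decodes still needs to be checked.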