Re: Sorting text data
Hi Sangroya
Your map method is emitting key/value pairs whose types differ from the types your driver class expects. TextInputFormat produces LongWritable keys and Text values, and I believe that is what causes the error. As per the code, the key expected from the mapper is BytesWritable, but because you specified TextInputFormat, the mapper is emitting LongWritable keys. Using the correct InputFormat should resolve your issue. Since you are using an IdentityMapper, you could also try specifying the map output key and value types along with the InputFormat.

-inFormat org.apache.hadoop.mapred.TextInputFormat

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.BytesWritable, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
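
To illustrate why the identity map fails here, below is a minimal plain-Java sketch of the check, not Hadoop's actual code. The names TypeMismatchSketch, CheckingCollector, runIdentityMap and tryRun are hypothetical, and Long and String merely stand in for LongWritable and Text. It assumes the collector compares the runtime class of each key with the key class declared for the job, and that text input yields (byte offset, line) records.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Simplified model of the map-side type check; all names here are hypothetical.
class TypeMismatchSketch {

    // Stand-in for the map output collector: rejects keys of an undeclared type.
    static final class CheckingCollector {
        private final Class<?> expectedKeyClass;
        private final List<Object[]> collected = new ArrayList<>();

        CheckingCollector(Class<?> expectedKeyClass) {
            this.expectedKeyClass = expectedKeyClass;
        }

        void collect(Object key, Object value) throws IOException {
            if (key.getClass() != expectedKeyClass) {
                throw new IOException("Type mismatch in key from map: expected "
                        + expectedKeyClass.getName() + ", received "
                        + key.getClass().getName());
            }
            collected.add(new Object[] {key, value});
        }

        int size() {
            return collected.size();
        }
    }

    // Models text input plus an identity map: key = byte offset of the line,
    // value = the line itself, both passed through unchanged.
    static void runIdentityMap(String text, CheckingCollector out) throws IOException {
        long offset = 0;
        for (String line : text.split("\n")) {
            out.collect(Long.valueOf(offset), line);
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1; // +1 for '\n'
        }
    }

    // Runs the sketch with a declared key class and reports the outcome as text.
    static String tryRun(String text, Class<?> declaredKeyClass) {
        try {
            CheckingCollector collector = new CheckingCollector(declaredKeyClass);
            runIdentityMap(text, collector);
            return "ok: " + collector.size() + " records";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String input = "banana\napple\ncherry";
        // Declared key type does not match the offset keys: fails on the first record.
        System.out.println(tryRun(input, String.class));
        // Declared key type matches the offset keys: all records are accepted.
        System.out.println(tryRun(input, Long.class));
    }
}
```

In this sketch the first call reports a type mismatch and the second accepts all three records. Note that passing the check only means the declared type matches what the input format emits; keys that are byte offsets would still sort the records by position rather than by their text.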

Regards
Bejoy K S

From handheld, Please excuse typos.

-----Original Message-----
From: sangroya <[EMAIL PROTECTED]>
Date: Wed, 8 Feb 2012 05:59:14
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Re: Sorting text data

Hi,

I tried to run the sort example with the input format specified, but I got
the following error while running it:
bin/hadoop jar hadoop-0.20.2-examples.jar sort -inFormat org.apache.hadoop.mapred.TextInputFormat /user/sangroya/test1 outtest16
Running on 1 nodes to sort from hdfs://localhost:54310/user/sangroya/test1 into hdfs://localhost:54310/user/sangroya/outtest16 with 1 reduces.
Job started: Wed Feb 08 14:53:14 CET 2012
12/02/08 14:53:14 INFO mapred.FileInputFormat: Total input paths to process : 1
12/02/08 14:53:14 INFO mapred.JobClient: Running job: job_201202021340_0030
12/02/08 14:53:15 INFO mapred.JobClient:  map 0% reduce 0%
12/02/08 14:53:27 INFO mapred.JobClient: Task Id : attempt_201202021340_0030_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.BytesWritable, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:40)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Can you please suggest what the issue is?
I also tried the following, specifying everything:
bin/hadoop jar hadoop-0.20.2-examples.jar sort -inFormat org.apache.hadoop.mapred.TextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text /user/sangroya/test1/ outtest11

But it still seems that there is a type mismatch issue.

Running on 1 nodes to sort from hdfs://localhost:54310/user/sangroya/test1 into hdfs://localhost:54310/user/sangroya/outtest88 with 1 reduces.
Job started: Wed Feb 08 14:57:19 CET 2012
12/02/08 14:57:19 INFO mapred.FileInputFormat: Total input paths to process : 1
12/02/08 14:57:19 INFO mapred.JobClient: Running job: job_201202021340_0031
12/02/08 14:57:20 INFO mapred.JobClient:  map 0% reduce 0%
12/02/08 14:57:33 INFO mapred.JobClient: Task Id : attempt_201202021340_0031_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:40)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
My input data is a text file.
Please help me out!

Thanks,
Amit

Sangroya
View this message in context: http://lucene.472066.n3.nabble.com/Sorting-text-data-tp3700231p3725997.html