MapReduce, mail # user - Job does not finish due to java.lang.ClassCastException: class java.util.Date


Paul van Hoven 2012-12-18, 16:18
Re: Job does not finish due to java.lang.ClassCastException: class java.util.Date
Harsh J 2012-12-18, 17:50
Hadoop, out of the box, does not serialize or understand Java's native
types (such as String, Date, etc.).

You will need to use Writable types [1], or use Avro serialization if
you want some of that work done for you already [2].

[1] - Hadoop: The Definitive Guide, chapter 2 and 4 probably. Also see
http://developer.yahoo.com/hadoop/tutorial/module4.html. Notably also
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Writable.html
[2] - Apache Avro described at http://avro.apache.org and MR explained
at http://avro.apache.org/docs/1.7.3/api/java/org/apache/avro/mapred/package-summary.html#package_description
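
As an illustrative sketch (not code from this thread), a map-output key
wrapping an epoch-millisecond timestamp could replace java.util.Date. To
keep the sketch self-contained it uses only JDK types; in an actual job
the class would instead declare
"implements org.apache.hadoop.io.WritableComparable<TimestampWritable>",
whose write/readFields/compareTo contract the methods below already
match. The class name is hypothetical.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical key class: an epoch-millisecond timestamp. In a real job,
// replace "Comparable<TimestampWritable>" with Hadoop's
// WritableComparable<TimestampWritable>; the signatures already match.
public class TimestampWritable implements Comparable<TimestampWritable> {
    private long millis; // e.g. the value of Date.getTime()

    public TimestampWritable() {}                 // Hadoop needs a no-arg constructor
    public TimestampWritable(long millis) { this.millis = millis; }

    public long get() { return millis; }

    // Writable.write: serialize the key as a fixed 8-byte long
    public void write(DataOutput out) throws IOException {
        out.writeLong(millis);
    }

    // Writable.readFields: deserialize in the same order as write()
    public void readFields(DataInput in) throws IOException {
        millis = in.readLong();
    }

    // WritableComparable.compareTo: reducers receive keys in chronological order
    @Override
    public int compareTo(TimestampWritable other) {
        return Long.compare(millis, other.millis);
    }

    @Override
    public int hashCode() { return Long.hashCode(millis); } // stable partitioning

    @Override
    public boolean equals(Object o) {
        return o instanceof TimestampWritable
            && ((TimestampWritable) o).millis == millis;
    }
}
```

The write/readFields round-trip is what Hadoop performs when spilling and
shuffling map output, which is exactly the step that fails for a plain
java.util.Date key.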

On Tue, Dec 18, 2012 at 9:48 PM, Paul van Hoven
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I wrote a small Java program for processing some log files (one record
> per line, separated by newlines). I'm using EMR on Amazon to run my
> job. I wrote the following code:
>
> public static class Map extends Mapper<LongWritable, Text, Date, Object> {
>
>                 private LineParser lineParser = new LineParser();
>
>                 public void map( LongWritable key, Text value, Context context )
> throws IOException, InterruptedException {
>
>                         Click click = lineParser.parseClick( value.toString() );
>                         if( click != null ) {
>                                 context.write( click.timestamp, click );
>                                 return;
>                         }
>
>                         Conversion conversion = lineParser.parseConversion( value.toString() );
>                         if( conversion != null ) {
>                                 context.write( conversion.timestamp, conversion );
>                                 return;
>                         }
>
>                 }
>         }
>
>
>
>         public static class Reduce extends Reducer<Date, Object, Text, Text> {
>
>                 public void reduce( Date key, Iterable<Object> values, Context
> context) throws IOException, InterruptedException {
>
>                         SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss Z");
>                         for( Object obj : values ) {
>                                 if( obj instanceof Click )
>                                         context.write( new Text( sdf.format( key ) ), new Text( ((Click)
> obj).toHadoopString() ) );
>                                 else
>                                         context.write( new Text( sdf.format( key ) ), new Text( ((Conversion)
> obj).toHadoopString() ) );
>                         }
>
>                 }
>
>         }
>
>
>
>         public static void main(String[] args) throws Exception {
>                 Configuration conf = new Configuration();
>
>                 Job job = new Job(conf, "MyHadoopJob");
>                 job.setJarByClass(MyHadoopClass.class);
>
>                 job.setOutputKeyClass(Date.class);
>                 job.setOutputValueClass(Object.class);
>
>                 job.setMapperClass(Map.class);
>                 job.setReducerClass(Reduce.class);
>
>                 job.setInputFormatClass(TextInputFormat.class);
>                 job.setOutputFormatClass(TextOutputFormat.class);
>
>                 FileInputFormat.addInputPath( job, new Path( args[0] ) );
>                 FileOutputFormat.setOutputPath( job, new Path( args[1] ) );
>
>                 job.waitForCompletion(true);
>         }
>
>
> Unfortunately I get the following Exception and the job fails:
>
>
> java.lang.ClassCastException: class java.util.Date
>         at java.lang.Class.asSubclass(Class.java:3018)
>         at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:786)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:975)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:681)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)

Harsh J
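
The stack trace above comes from JobConf.getOutputKeyComparator(), which
calls Class.asSubclass(WritableComparable.class) on the map-output key
class; java.util.Date does not implement that interface, hence the
ClassCastException. As a hedged sketch (not from the thread) of how the
driver configuration might look once the intermediate types are
Writables: TimestampWritable and EventWritable are hypothetical
user-written Writable classes, and Map/Reduce stand for the mapper and
reducer from the message above; everything else is the standard
MapReduce API.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MyHadoopClass {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "MyHadoopJob");
        job.setJarByClass(MyHadoopClass.class);

        // The mapper's output types differ from the reducer's final output
        // types, so they must be set explicitly -- and both must be
        // Writable (the key additionally WritableComparable).
        // TimestampWritable and EventWritable are hypothetical classes.
        job.setMapOutputKeyClass(TimestampWritable.class);
        job.setMapOutputValueClass(EventWritable.class);

        // Final (reducer) output is (Text, Text).
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setMapperClass(Map.class);       // the Map class from the thread
        job.setReducerClass(Reduce.class);   // the Reduce class from the thread

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that setOutputValueClass(Object.class) in the original driver has
the same problem as the Date key: plain Object is not Writable, so the
Click/Conversion values also need a Writable representation (or Avro,
per [2] above).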