Hadoop, mail # user - InputSplits, Serializers in Hadoop 0.20


InputSplits, Serializers in Hadoop 0.20
Saptarshi Guha 2009-08-10, 15:49
Hello,
In my custom InputFormat, written using the new Hadoop 0.20 API, I get
the following error:
at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:899)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
The code in writeNewSplits that triggers this is the last line of the
following excerpt:

...
    try {
      if (array.length != 0) {
        DataOutputBuffer buffer = new DataOutputBuffer();
        RawSplit rawSplit = new RawSplit();
        SerializationFactory factory = new SerializationFactory(conf);

        Serializer<T> serializer = factory.getSerializer((Class<T>) array[0].getClass());
...

My InputSplit implementation has the read and write methods, but I can't
figure out what is causing this error.

Thank you in advance
Saptarshi
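
One possible explanation, offered as an assumption rather than a confirmed diagnosis, since the trace above does not show the exception type: getSerializer may fail when no configured serialization accepts the split class. The Writable-based serialization decides this by checking whether the class implements org.apache.hadoop.io.Writable, not by looking for read and write methods, and the new-API org.apache.hadoop.mapreduce.InputSplit does not implement Writable on its own. The sketch below illustrates that distinction without Hadoop on the classpath. The local Writable interface is a stand-in with the same two methods as Hadoop's, and RangeSplit, UndeclaredSplit and SplitCheck are hypothetical names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in for org.apache.hadoop.io.Writable (same two methods),
// so that this sketch compiles without Hadoop.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical split. In a real job it would also extend
// org.apache.hadoop.mapreduce.InputSplit.
class RangeSplit implements Writable {
    long start;
    long length;

    // No-arg constructor so an empty instance can be created before readFields.
    RangeSplit() {}

    RangeSplit(long start, long length) {
        this.start = start;
        this.length = length;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(start);
        out.writeLong(length);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        start = in.readLong();
        length = in.readLong();
    }
}

// Has the same method names but does not declare the interface.
class UndeclaredSplit {
    public void write(DataOutput out) {}
    public void readFields(DataInput in) {}
}

class SplitCheck {
    // Type test on the declared interface; method names alone do not count.
    static boolean accepted(Class<?> c) {
        return Writable.class.isAssignableFrom(c);
    }

    // Serialize a split to bytes and read it back into a fresh instance.
    static RangeSplit roundTrip(RangeSplit split) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            split.write(new DataOutputStream(bytes));
            RangeSplit copy = new RangeSplit();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If this assumption matches the actual failure, declaring the custom split as both extending InputSplit and implementing Writable would be the first thing to check. Here SplitCheck.accepted(RangeSplit.class) is true while SplitCheck.accepted(UndeclaredSplit.class) is false, even though both classes have write and readFields methods.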