Avro >> mail # user >> How to read with SpecificDatumReader


Alan Miller 2012-12-20, 15:21
Re: How to read with SpecificDatumReader
It looks to me like in your non-Hadoop application
com.company.app.MyRecord is not on the classpath.

Doug
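
If the jar with the generated com.company.app.MyRecord class cannot be added to that application's classpath, the same file can still be read through Avro's generic API instead of the specific one. A minimal sketch, assuming Avro 1.7.x and a hypothetical local path for the data file:

    import java.io.File;
    import java.io.IOException;
    import org.apache.avro.file.DataFileReader;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.DatumReader;

    public class GenericReadExample {
      public static void main(String[] args) throws IOException {
        // Hypothetical local copy of the Avro data file.
        File localFile = new File("/tmp/myrecords.avro");

        // No generated class needed: the writer schema stored in the file
        // is used, and records come back as GenericRecord.
        DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>();
        DataFileReader<GenericRecord> dataFileReader =
            new DataFileReader<GenericRecord>(localFile, datumReader);
        try {
          GenericRecord record = null;
          while (dataFileReader.hasNext()) {
            record = dataFileReader.next(record);
            // Avro returns strings as org.apache.avro.util.Utf8, hence toString().
            System.out.printf("owner = %s%n", record.get("owner").toString());
          }
        } finally {
          dataFileReader.close();
        }
      }
    }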

On Thu, Dec 20, 2012 at 7:21 AM, Alan Miller <[EMAIL PROTECTED]> wrote:
> I can write my Avro data fine, but how do I read my data records with the
> SpecificDatumReader?
>
> Basically, I write my (hdfs) data file like this:
>     Schema schema = new MyRecord().getSchema();
>     DatumWriter<MyRecord> writer = new SpecificDatumWriter<MyRecord>(schema);
>     DataFileWriter<MyRecord> dataFileWriter = new DataFileWriter<MyRecord>(writer);
>     FSDataOutputStream fos = fs.create(avroPath);
>     dataFileWriter.create(schema, fos);
>     for (MyRecord r : map.values()) {
>         dataFileWriter.flush();
>         dataFileWriter.append(r);
>     }
>     dataFileWriter.flush();
>
> This works fine because my MR job processes the generated files via
>     Job job = new Job(config, jobName);
>     job.setJarByClass(getClass());
>     AvroJob.setInputKeySchema(job, schema);
>     AvroJob.setInputValueSchema(job, schema);
>     job.setInputFormatClass(AvroKeyInputFormat.class);
>     job.setMapperClass(MyMapper.class);
>
> Now I need to read the file from a different (non-Hadoop) application but
> when I try to read the data like this:
> 596     DatumReader<MyRecord> myDatumReader = new SpecificDatumReader<MyRecord>(MyRecord.class);
> 597     DataFileReader<MyRecord> dataFileReader = new DataFileReader<MyRecord>(localFile, myDatumReader);
> 598     MyRecord record = null;
> 599     String owner = null;
> 600     while (dataFileReader.hasNext()) {
> 601         record = dataFileReader.next(record);
> 602         owner = record.getOwner().toString();
> 603         System.out.printf("owner = %s\n", owner);
> 604     }
> 605     dataFileReader.close();
>
> I get this error:
> Exception in thread "main" java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to com.company.app.MyRecord
>     at com.company.app.MyDriver.readAvroData(MyDriver.java:601)
>     at com.company.app.MyDriver.main(MyDriver.java:1378)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Alan
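
Doug's diagnosis is straightforward to confirm from the standalone application itself: when SpecificDatumReader cannot load the class named in the writer schema, it falls back to returning GenericData.Record, which would explain exactly the cast that fails at MyDriver.java:601. A quick check, assuming nothing beyond the class name already shown in the stack trace:

    // Throws ClassNotFoundException if the jar containing the generated
    // Avro classes is missing from this JVM's classpath. Once the class
    // loads, the SpecificDatumReader code at lines 596-605 above should
    // return MyRecord instances rather than GenericData.Record.
    Class.forName("com.company.app.MyRecord");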