NPE with Generic/SpecificDatumWriter in Avro 1.3.3
Lewis John Mcgibbney 2012-12-07, 16:39
Hi,
We have an issue over in Nutch where we are trying to inject URLs into an Avro-backed file store (which resides in Gora [0]). The schema we are using to generate the Java classes that store the data can be found here [1]. Currently, when I use the Nutch Inject tool [2] (an MR job which reads a flat file of URLs, adds metadata, then stores these in the file store), I get the following stack trace:

java.lang.NullPointerException
    at org.apache.avro.specific.SpecificDatumWriter.getField(SpecificDatumWriter.java:48)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
    at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
    at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
    at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

So I guess I have the following questions:

1) Is the schema OK? Is there anything which should be changed?
2) If not, then can someone please explain to me (if possible) how we get the NPE and what field within the schema might relate to this?

I am keen to learn more about this; any help in order to do so would be greatly appreciated.

Thanks,
Lewis

[0] http://svn.apache.org/repos/asf/gora/trunk/gora-core/src/main/java/org/apache/gora/avro/store/DataFileAvroStore.java
[1] https://issues.apache.org/jira/secure/attachment/12559852/webpage.avsc
[2] http://svn.apache.org/repos/asf/nutch/branches/2.x/src/java/org/apache/nutch/crawl/InjectorJob.java

--
Lewis
Lewis John Mcgibbney 2012-12-07, 17:19