Anson Abraham 2012-04-25, 14:39
Jakob Homan 2012-04-25, 17:51
Re: Issues inserting Hive table in avro format...
Jakob Homan 2012-04-25, 17:52
Also, it may be faster if you just open an issue on haivvreo's issue
tracker at github...

On Wed, Apr 25, 2012 at 10:51 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> What version of everything are you running? There were recent commits
> that fixed some bugs when doing this.
>
> On Wed, Apr 25, 2012 at 7:39 AM, Anson Abraham <[EMAIL PROTECTED]> wrote:
>> In Hive, I'm having issues doing an insert overwrite table to a table that
>> is in avro format.
>>
>> So my existing table (table1) is read from a hdfs directory where the files
>> are in avro format.
>>
>> I created another table table2 in avro format (which is identical in
>> columns, etc...):
>>
>> CREATE EXTERNAL TABLE IF NOT EXISTS table2
>> ROW FORMAT SERDE
>>   'com.linkedin.haivvreo.AvroSerDe'
>> WITH SERDEPROPERTIES (
>>   'schema.literal'='
>> {
>>   "type" : "record",
>>   "name" : "Record",
>>   "namespace" : "GenericData",
>>   "doc" : "table2/v1",
>>   "fields" : [ {
>>     "name" : "index_id",
>>     "type" : "long"
>>   },{
>>     "name" : "id",
>>     "type" : [ "null", "long" ],
>>     "default" : null
>>   } ]
>> }')
>> STORED AS INPUTFORMAT
>>   'com.linkedin.haivvreo.AvroContainerInputFormat'
>> OUTPUTFORMAT
>>   'com.linkedin.haivvreo.AvroContainerOutputFormat'
>> LOCATION 'hdfs://hivenode/table2/';
>>
>>
>> So when I did a
>>
>> insert overwrite table table2
>> select * from table1
>>
>>
>> I get an error:
>>
>> Ended Job = job_xxxxx_0123 with errors
>>
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>
>>
>> Looking closely i see this:"
>>
>> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:231)
>> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>> at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
>> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:169)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>> at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:116)
>> at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:41)
>> at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:98)
>> at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
>> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>> at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:208)
>> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
>> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>
>> "
>>
>> We're using
>>
>> haivvreo
>>
>> https://github.com/jghoman/haivvreo
>>
>>
>> Anyone have any ideas?
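
The quoted trace passes through Avro's ResolvingDecoder, which is where Avro reconciles the schema a file was written with against the schema the reader expects, so one thing that may be worth checking is whether the schemas involved really are identical. The sketch below is only an illustration of that check using plain JSON comparison, not a diagnosis of this failure: TABLE2_SCHEMA is copied from the quoted CREATE TABLE statement, while HYPOTHETICAL_TABLE1_SCHEMA and the helper names field_types and diff_fields are made up for the example, since the thread never shows table1's schema.

```python
import json

# Copied from the 'schema.literal' in the quoted CREATE TABLE for table2.
TABLE2_SCHEMA = """
{ "type": "record", "name": "Record", "namespace": "GenericData",
  "doc": "table2/v1",
  "fields": [ { "name": "index_id", "type": "long" },
              { "name": "id", "type": ["null", "long"], "default": null } ] }
"""

# HYPOTHETICAL: the thread does not show table1's schema. This one differs
# on purpose (id is a plain long) so the comparison has something to report.
HYPOTHETICAL_TABLE1_SCHEMA = """
{ "type": "record", "name": "Record", "namespace": "GenericData",
  "fields": [ { "name": "index_id", "type": "long" },
              { "name": "id", "type": "long" } ] }
"""


def field_types(schema_literal):
    """Map each top-level field name of a record schema to its declared type."""
    schema = json.loads(schema_literal)
    return {field["name"]: field["type"] for field in schema["fields"]}


def diff_fields(writer_literal, reader_literal):
    """List top-level fields whose presence or declared type differs."""
    writer = field_types(writer_literal)
    reader = field_types(reader_literal)
    differences = []
    for name in sorted(set(writer) | set(reader)):
        if name not in writer:
            differences.append(f"{name}: only in reader schema")
        elif name not in reader:
            differences.append(f"{name}: only in writer schema")
        elif writer[name] != reader[name]:
            differences.append(
                f"{name}: writer {writer[name]!r} vs reader {reader[name]!r}"
            )
    return differences


if __name__ == "__main__":
    for line in diff_fields(HYPOTHETICAL_TABLE1_SCHEMA, TABLE2_SCHEMA):
        print(line)
```

A difference reported here is not necessarily an error, because Avro's schema resolution tolerates some mismatches (for example, a reader field with a default that the writer lacks). It only narrows down where to look before filing the issue.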