Hive, mail # user - Issues inserting Hive table in avro format...


Re: Issues inserting Hive table in avro format...
Jakob Homan 2012-04-25, 17:51
What version of everything are you running?  There were recent commits
that fixed some bugs when doing this.

On Wed, Apr 25, 2012 at 7:39 AM, Anson Abraham <[EMAIL PROTECTED]> wrote:
> In Hive, I'm having issues doing an INSERT OVERWRITE TABLE into a table that
> is in Avro format.
>
> My existing table (table1) reads from an HDFS directory where the files
> are in Avro format.
>
> I created another table, table2, in Avro format (identical in
> columns, etc.):
>
> CREATE EXTERNAL TABLE IF NOT EXISTS table2
> ROW FORMAT SERDE
> 'com.linkedin.haivvreo.AvroSerDe'
> WITH SERDEPROPERTIES (
> 'schema.literal'='
> {
> "type" : "record",
> "name" : "Record",
> "namespace" : "GenericData",
> "doc" : "table2/v1",
> "fields" : [ {
> "name" : "index_id",
> "type" : "long"
> },{
> "name" : "id",
> "type" : [ "null", "long" ],
> "default" : null
> } ]
> }')
> STORED AS INPUTFORMAT
> 'com.linkedin.haivvreo.AvroContainerInputFormat'
> OUTPUTFORMAT
> 'com.linkedin.haivvreo.AvroContainerOutputFormat'
> LOCATION 'hdfs://hivenode/table2/';
>
>
> So when I run
>
> insert overwrite table table2
> select * from table1
>
>
> I get an error:
>
> Ended Job = job_xxxxx_0123 with errors
>
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
>
> Looking closely, I see this: "
>
> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:231)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:169)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
> at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:116)
> at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:41)
> at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:98)
> at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
> at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:208)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> at org.apache.hadoop.mapred.Child.main(Child.java:264)
> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> "
>
> We're using
>
> haivvreo
>
> https://github.com/jghoman/haivvreo
>
>
> Anyone have any ideas?
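
The quoted trace fails inside Avro's ResolvingDecoder, the piece that reconciles the schema a file was written with against the schema the reader expects. The exception message itself isn't shown, so the cause can't be confirmed from this thread. Assuming it is a schema mismatch, here is a rough sketch of one sanity check: Avro matches record fields by name, and a reader field that is missing from the writer schema must carry a default. READER_SCHEMA is the schema.literal from the CREATE TABLE statement above; WRITER_SCHEMA and unresolvable_fields are hypothetical stand-ins, since table1's real schema isn't shown. The check is deliberately simplified and ignores type promotion, unions, and aliases.

```python
import json

# Reader schema, copied from the schema.literal in the CREATE TABLE statement.
READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Record",
  "namespace": "GenericData",
  "doc": "table2/v1",
  "fields": [
    {"name": "index_id", "type": "long"},
    {"name": "id", "type": ["null", "long"], "default": null}
  ]
}
""")

# HYPOTHETICAL writer schema, standing in for whatever is embedded in
# table1's Avro files. It is not taken from the thread.
WRITER_SCHEMA = {
    "type": "record",
    "name": "Record",
    "fields": [
        {"name": "id", "type": ["null", "long"], "default": None},
    ],
}


def unresolvable_fields(writer, reader):
    """Return names of reader fields absent from the writer with no default.

    Simplified: compares field names only, not field types.
    """
    writer_names = {field["name"] for field in writer["fields"]}
    return [
        field["name"]
        for field in reader["fields"]
        if field["name"] not in writer_names and "default" not in field
    ]


print(unresolvable_fields(WRITER_SCHEMA, READER_SCHEMA))
```

With the made-up writer schema above, index_id is reported because it has no default and the writer never wrote it. To use this for real, replace WRITER_SCHEMA with the schema stored in the header of one of table1's data files.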