Hive >> mail # user >> Issues inserting Hive table in avro format...


Re: Issues inserting Hive table in avro format...
Also, it may be faster if you just open an issue on haivvreo's issue
tracker on GitHub...

On Wed, Apr 25, 2012 at 10:51 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> What version of everything are you running?  There were recent commits
> that fixed some bugs when doing this.
>
> On Wed, Apr 25, 2012 at 7:39 AM, Anson Abraham <[EMAIL PROTECTED]> wrote:
>> In Hive, I'm having issues doing an INSERT OVERWRITE TABLE into a table
>> that is in Avro format.
>>
>> So my existing table (table1) is read from a hdfs directory where the files
>> are in avro format.
>>
>> I created another table table2 in avro format (which is identical in
>> columns, etc...):
>>
>> CREATE EXTERNAL TABLE IF NOT EXISTS table2
>> ROW FORMAT SERDE 'com.linkedin.haivvreo.AvroSerDe'
>> WITH SERDEPROPERTIES (
>> 'schema.literal'='
>> {
>>   "type" : "record",
>>   "name" : "Record",
>>   "namespace" : "GenericData",
>>   "doc" : "table2/v1",
>>   "fields" : [ {
>>     "name" : "index_id",
>>     "type" : "long"
>>   }, {
>>     "name" : "id",
>>     "type" : [ "null", "long" ],
>>     "default" : null
>>   } ]
>> }')
>> STORED AS INPUTFORMAT 'com.linkedin.haivvreo.AvroContainerInputFormat'
>> OUTPUTFORMAT 'com.linkedin.haivvreo.AvroContainerOutputFormat'
>> LOCATION 'hdfs://hivenode/table2/';
>>
>>
>> So when I ran:
>>
>> INSERT OVERWRITE TABLE table2
>> SELECT * FROM table1;
>>
>> I get this error:
>>
>> Ended Job = job_xxxxx_0123 with errors
>>
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>
>>
>> Looking closely, I see this: "
>>
>>       at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:231)
>>       at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>       at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
>>       at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:169)
>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
>>       at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>       at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:116)
>>       at com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:41)
>>       at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:98)
>>       at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
>>       at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>>       at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:208)
>>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
>>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
>>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>
>> "
>>
>> We're using haivvreo: https://github.com/jghoman/haivvreo
>>
>>
>> Anyone have any ideas?
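
A ResolvingDecoder.doAction failure like the one in the trace above typically means Avro could not resolve the reader schema (here, the one declared in table2's schema.literal) against the writer schema embedded in table1's data files. As a rough illustration only (not haivvreo's actual resolution code, and the "string" writer type below is a hypothetical mismatch, not taken from the thread), a stdlib-only sketch that diffs two schema literals field by field:

```python
import json

# Reader schema: what table2's schema.literal declares (from the thread above).
reader_schema = json.loads("""
{
  "type": "record",
  "name": "Record",
  "namespace": "GenericData",
  "fields": [
    {"name": "index_id", "type": "long"},
    {"name": "id", "type": ["null", "long"], "default": null}
  ]
}
""")

# Writer schema: a hypothetical schema embedded in table1's files, with a
# type Avro cannot promote to the reader's "long".
writer_schema = json.loads("""
{
  "type": "record",
  "name": "Record",
  "namespace": "GenericData",
  "fields": [
    {"name": "index_id", "type": "string"},
    {"name": "id", "type": ["null", "long"], "default": null}
  ]
}
""")

def field_types(schema):
    """Map field name -> declared type, normalizing unions to tuples."""
    return {
        f["name"]: tuple(f["type"]) if isinstance(f["type"], list) else f["type"]
        for f in schema["fields"]
    }

def schema_mismatches(reader, writer):
    """Return (name, writer_type, reader_type) for fields whose types differ.

    Real Avro schema resolution also allows promotions (e.g. int -> long)
    and matches fields by alias; this sketch only flags exact differences.
    """
    r, w = field_types(reader), field_types(writer)
    return [
        (name, w.get(name), r.get(name))
        for name in sorted(set(r) | set(w))
        if r.get(name) != w.get(name)
    ]

for name, wtype, rtype in schema_mismatches(reader_schema, writer_schema):
    print(f"field {name!r}: writer={wtype} reader={rtype}")
# -> field 'index_id': writer=string reader=long
```

To get the actual writer schema out of a data file for a comparison like this, the avro-tools jar's getschema command (java -jar avro-tools.jar getschema file.avro) prints the schema embedded in an Avro container file.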