Hive >> mail # user >> Issues inserting Hive table in avro format...


Anson Abraham 2012-04-25, 14:39
Jakob Homan 2012-04-25, 17:51
Re: Issues inserting Hive table in avro format...
Also, it may be faster if you just open an issue on haivvreo's issue
tracker on GitHub...

On Wed, Apr 25, 2012 at 10:51 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> What version of everything are you running?  There were recent commits
> that fixed some bugs when doing this.
>
> On Wed, Apr 25, 2012 at 7:39 AM, Anson Abraham <[EMAIL PROTECTED]> wrote:
>> In Hive, I'm having issues doing an INSERT OVERWRITE TABLE into a table that
>> is in Avro format.
>>
>> So my existing table (table1) is read from an HDFS directory where the files
>> are in Avro format.
>>
>> I created another table table2 in avro format (which is identical in
>> columns, etc...):
>>
>> CREATE EXTERNAL TABLE IF NOT EXISTS table2
>> ROW FORMAT SERDE
>> 'com.linkedin.haivvreo.AvroSerDe'
>> WITH SERDEPROPERTIES (
>> 'schema.literal'='
>> {
>> "type" : "record",
>> "name" : "Record",
>> "namespace" : "GenericData",
>> "doc" : "table2/v1",
>> "fields" : [ {
>> "name" : "index_id",
>> "type" : "long"
>> },{
>> "name" : "id",
>> "type" : [ "null", "long" ],
>> "default" : null
>> } ]
>> }')
>> STORED AS INPUTFORMAT
>> 'com.linkedin.haivvreo.AvroContainerInputFormat'
>> OUTPUTFORMAT
>> 'com.linkedin.haivvreo.AvroContainerOutputFormat'
>> LOCATION 'hdfs://hivenode/table2/';
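A quick sanity check on a DDL like the one above is to confirm that the `schema.literal` string is valid JSON and to see which fields are nullable unions. The stdlib-only sketch below uses the two fields from the CREATE TABLE; none of it is haivvreo or Hive API:

```python
import json

# The schema.literal value from the CREATE TABLE above (the two fields shown).
schema_literal = """
{
  "type": "record",
  "name": "Record",
  "namespace": "GenericData",
  "doc": "table2/v1",
  "fields": [
    {"name": "index_id", "type": "long"},
    {"name": "id", "type": ["null", "long"], "default": null}
  ]
}
"""

# json.loads will raise ValueError if the literal pasted into the DDL is malformed.
schema = json.loads(schema_literal)

def nullable_fields(s):
    """Return names of fields whose type is a union that includes "null"."""
    return [f["name"] for f in s["fields"]
            if isinstance(f["type"], list) and "null" in f["type"]]

print(nullable_fields(schema))  # prints ['id']
```

If the literal fails to parse, or a field you expect to be optional is not a `["null", ...]` union, the DDL is the first thing to fix before chasing runtime errors.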
>>
>>
>> So when I did a
>>
>> insert overwrite table table2
>> select * from table1
>>
>>
>> I get an error:
>>
>> Ended Job = job_xxxxx_0123 with errors
>>
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>
>>
>> Looking closely, I see this: "
>>
>> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:231)
>>       at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>>       at
>> org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
>>       at
>> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:169)
>>       at
>> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
>>       at
>> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>>       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>>       at
>> com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:116)
>>       at
>> com.linkedin.haivvreo.AvroGenericRecordReader.next(AvroGenericRecordReader.java:41)
>>       at
>> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:98)
>>       at
>> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
>>       at
>> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>>       at
>> org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:208)
>>       at
>> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
>>       at
>> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
>>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>
>> "
>>
>> We're using
>>
>> haivvreo
>>
>> https://github.com/jghoman/haivvreo
>>
>>
>> Anyone have any ideas?
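A stack trace ending in `ResolvingDecoder.doAction` usually means Avro schema resolution failed: the files in table1 were written under one schema, and the reader schema Hive supplies cannot be resolved against it. The Avro spec's rule is that any reader field absent from the writer schema must declare a default. A minimal plain-Python sketch of that one rule (the writer/reader schemas here are hypothetical, not the actual table1/table2 schemas):

```python
# Stdlib-only illustration of Avro's schema-resolution rule for missing
# fields; this is NOT the avro library, just the rule it enforces.

def resolution_errors(writer_schema, reader_schema):
    """Return names of reader fields that cannot be resolved against the writer:
    fields the writer never wrote and for which the reader has no default."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    return [f["name"] for f in reader_schema["fields"]
            if f["name"] not in writer_fields and "default" not in f]

# Hypothetical: suppose table1's files were written before "id" existed.
writer = {"fields": [{"name": "index_id", "type": "long"}]}

# Reader schema with a default for the new field -- resolvable:
reader_ok = {"fields": [{"name": "index_id", "type": "long"},
                        {"name": "id", "type": ["null", "long"], "default": None}]}

# Same field without a default -- this shape fails resolution at read time:
reader_bad = {"fields": [{"name": "index_id", "type": "long"},
                         {"name": "id", "type": ["null", "long"]}]}

print(resolution_errors(writer, reader_ok))   # prints []
print(resolution_errors(writer, reader_bad))  # prints ['id']
```

So a useful first step is to dump the schema embedded in one of table1's Avro container files and diff it against table2's `schema.literal`: field names, types, and defaults must line up under these resolution rules.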