Avro >> mail # user >> Re: Problem: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 64 / avro.io.SchemaResolutionException: Can't access branch index 64 for union with 2 branches / `read_data': Writer's schema and Reader's schema ["string","null"] do not mat


Russell Jurney 2012-03-24, 03:27
Scott Carey 2012-03-26, 15:55
Russell Jurney 2012-03-24, 02:01
Re: Problem: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 64 / avro.io.SchemaResolutionException: Can't access branch index 64 for union with 2 branches / `read_data': Writer's schema and Reader's schema ["string","null"] do not match.

It appears to be reading a union index and failing there somehow.  If the
trace did not go through the Pig AvroStorage code, I could tell you
more.
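
For context on what "reading a union index" means: in Avro's binary encoding, a union value is written as a long (a zigzag-encoded varint) giving the zero-based branch position, followed by the value itself. The sketch below is a standalone illustration of that step, not the Avro library's own code; `read_long` and `read_union_branch` are hypothetical names. It shows how a decoder that has drifted out of position, or hit corrupt bytes, could read an index such as 64 for a union with only 2 branches.

```python
import io


def read_long(stream):
    """Decode one Avro long: a little-endian base-128 varint, zigzag-encoded."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a varint")
        value = byte[0]
        accum |= (value & 0x7F) << shift
        if not value & 0x80:  # high bit clear means this was the last byte
            break
        shift += 7
    # Undo zigzag: even numbers are non-negative, odd numbers are negative.
    return (accum >> 1) ^ -(accum & 1)


def read_union_branch(stream, branches):
    """Read the branch index of a union and return the selected branch schema."""
    index = read_long(stream)
    if not 0 <= index < len(branches):
        # Illustrative message only, modelled on the Python error quoted below.
        raise ValueError(
            "Can't access branch index %d for union with %d branches"
            % (index, len(branches))
        )
    return branches[index]


if __name__ == "__main__":
    union = ["string", "null"]
    print(read_union_branch(io.BytesIO(b"\x02"), union))  # zigzag 2 is index 1
    try:
        read_union_branch(io.BytesIO(b"\x80\x01"), union)  # zigzag 128 is index 64
    except ValueError as exc:
        print(exc)
```

If the bytes at the union's position are really part of some other field, any value can come out as the "index", which is why the same number shows up in both the Java and the Python failures.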

What does the 'tojson' tool in avro-tools.jar do with the file?  (java -jar
avro-tools-1.6.3.jar tojson <file> | your_favorite_text_reader)
Which version of Avro produced the Java stack trace below?
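
A related check is to count how many records can be read before the failure, which narrows down where the bad record sits in the file. This is a minimal sketch under stated assumptions: `first_unreadable_record` is a hypothetical helper that works on any iterable, and the commented usage assumes the Avro Python package's `DataFileReader` and `DatumReader` with a placeholder file name.

```python
def first_unreadable_record(records):
    """Iterate over records and report where reading stops.

    Returns (count_read, error): the number of records read successfully,
    and the exception raised by the next read, or None if the input ended
    cleanly.
    """
    count = 0
    iterator = iter(records)
    while True:
        try:
            next(iterator)
        except StopIteration:
            return count, None
        except Exception as exc:  # report the decoding error instead of crashing
            return count, exc
        count += 1


# Assumed usage with the Avro Python package ("problem.avro" is a placeholder):
#
#     from avro.datafile import DataFileReader
#     from avro.io import DatumReader
#
#     with open("problem.avro", "rb") as fh:
#         reader = DataFileReader(fh, DatumReader())
#         print(first_unreadable_record(reader))
```

Knowing the count makes it easier to compare against the tojson output and see whether both tools stop at the same record.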
On 3/23/12 7:01 PM, "Russell Jurney" <[EMAIL PROTECTED]> wrote:

> I have a problem record I've written in Avro that crashes anything which tries
> to read it :(
>
> Can anyone make sense of these errors?
>
> The exception in Pig/AvroStorage is this:
>
>> java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 64
>> at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:275)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>> at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>> at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>> at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:67)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>> at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:80)
>> at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:273)
>> ... 7 more
>
> When reading the record in Python:
>
>>   File "/me/Collecting-Data/src/python/cat_avro", line 21, in <module>
>>     for record in df_reader:
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/datafile.py", line 354, in next
>>     datum = self.datum_reader.read(self.datum_decoder)
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 445, in read
>>     return self.read_data(self.writers_schema, self.readers_schema, decoder)
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 490, in read_data
>>     return self.read_record(writers_schema, readers_schema, decoder)
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 690, in read_record
>>     field_val = self.read_data(field.type, readers_field.type, decoder)
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 488, in read_data
>>     return self.read_union(writers_schema, readers_schema, decoder)
>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 650, in read_union
>>     raise SchemaResolutionException(fail_msg, writers_schema, readers_schema)
>> avro.io.SchemaResolutionException: Can't access branch index 64 for union with 2 branches
>
> When reading the record in Ruby:
>
>> /Users/peyomp/.rvm/gems/ruby-1.8.7-p352/gems/avro-1.6.1/lib/avro/io.rb:298:in `read_data': Writer's schema  and Reader's schema ["string","null"] do not match. (Avro::IO::SchemaMatchException)
>
Russell Jurney 2012-03-24, 03:01