Re: Problem with Pig AvroStorage, with Avros that work in Ruby and Python
Correction: when I read the file in Python, I get the error below.  It
looks like a Unicode problem.  Can one tell Avro how to handle this?

Traceback (most recent call last):
  File "./cat_avro", line 21, in <module>
    for record in df_reader:
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/datafile.py", line 354, in next
    datum = self.datum_reader.read(self.datum_decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 490, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 690, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 488, in read_data
    return self.read_union(writers_schema, readers_schema, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 654, in read_union
    return self.read_data(selected_writers_schema, readers_schema, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 458, in read_data
    return self.read_data(writers_schema, s, decoder)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 468, in read_data
    return decoder.read_utf8()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 233, in read_utf8
    return unicode(self.read_bytes(), "utf-8")
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 543: invalid start byte
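
One possible reading of the error above: Avro's string type is defined as UTF-8 encoded text, and 0xa0 is a UTF-8 continuation byte, so it can never begin a character. In Latin-1 the same byte is a non-breaking space, which suggests (but does not prove) that the writer put non-UTF-8 text into a string field. Below is a minimal Python 3 sketch of how one might locate and inspect such a byte. The sample payload and the helper name first_invalid_utf8 are made up for illustration and are not part of the Avro API.

```python
def first_invalid_utf8(data):
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte the codec rejected.
        return err.start, data[err.start]
    return None


# Hypothetical payload: "cafe" with an accented e in valid UTF-8,
# followed by a stray 0xA0 byte (a non-breaking space in Latin-1).
sample = b"caf\xc3\xa9\xa0bar"

bad = first_invalid_utf8(sample)

# Lenient decode: the invalid byte becomes U+FFFD instead of raising.
lenient = sample.decode("utf-8", errors="replace")

# The same byte interpreted as Latin-1 is U+00A0 (NBSP).
as_latin1 = bytes([0xA0]).decode("latin-1")

print(bad, repr(lenient), repr(as_latin1))
```

If the source text does turn out to be Latin-1, the places to look would be the writer side: re-encoding the text as UTF-8 before writing, or declaring the field as bytes rather than string in the schema.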
On Thu, Feb 2, 2012 at 2:06 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> I am writing Avro records in Ruby using the avro ruby gem in 1.8.7.  I
> have problems with loading these files sometimes.  As a result, I am unable
> to write large files that are readable.
>
> The exception I get is below.  Anyone have an idea what this means?  It
> looks like Avro is having trouble parsing the schema.  The avro files parse
> in Ruby and Python, just not Pig.  Are there more rigorous checks in Java?
>
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error. org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>
> java.lang.NoSuchMethodError: org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>     at org.apache.avro.Schema.<clinit>(Schema.java:82)
>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>     at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>     at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
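
A note on the quoted trace: a java.lang.NoSuchMethodError raised from a static initializer (Schema.<clinit>) usually means the class that was loaded at runtime is older or newer than the one the caller was compiled against, rather than that the schema itself is malformed. Here that would point at the Jackson version on Pig's classpath. As a rough sketch, assuming the jars follow the usual name-version.jar convention, the Python below groups Jackson jars on a classpath string by artifact and reports any artifact present in more than one version. The function names and the example paths are hypothetical, and the check cannot see Jackson classes bundled inside another jar.

```python
import os
import re
from collections import defaultdict

# Matches file names such as jackson-core-asl-1.5.5.jar (artifact, version).
JAR_PATTERN = re.compile(r"^(jackson-[a-z-]+?)-(\d[\w.]*)\.jar$")


def find_jackson_versions(classpath):
    """Map each Jackson artifact name to the set of versions on the classpath."""
    versions = defaultdict(set)
    for entry in classpath.split(os.pathsep):
        match = JAR_PATTERN.match(os.path.basename(entry))
        if match:
            versions[match.group(1)].add(match.group(2))
    return dict(versions)


def conflicting(versions):
    """Keep only artifacts that appear in more than one version."""
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Hypothetical classpath, for illustration only.
example_cp = os.pathsep.join([
    "/usr/lib/hadoop/lib/jackson-core-asl-1.5.5.jar",
    "/usr/lib/pig/lib/jackson-core-asl-1.8.8.jar",
    "/usr/lib/pig/lib/avro-1.6.1.jar",
])

print(conflicting(find_jackson_versions(example_cp)))
```

Run against the real classpath of the failing Pig job, an artifact listed with two versions would be a reasonable first suspect.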

Russell Jurney
twitter.com/rjurney
[EMAIL PROTECTED]
datasyndrome.com