Pig >> mail # user >> AvroStorage Issue in 0.9.2-cdh4.0.0 -- Schema is unknown


Re: AvroStorage Issue in 0.9.2-cdh4.0.0 -- Schema is unknown
You can download, unzip and use Pig 0.10 without waiting for a
Cloudera release. It is a client-side tool. AvroStorage didn't work
very well until 0.10, I'm afraid.

Russell Jurney http://datasyndrome.com
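
Putting the pieces of this thread together, a Grunt session on Pig 0.10 might look roughly like the sketch below. It assumes the three jars sit at the paths quoted further down in this thread (those paths belong to one particular checkout and will differ elsewhere) and it reuses the placeholder data path from the original question. It has not been checked against any specific Pig or CDH release.

```pig
-- Jar paths are the ones quoted in this thread; adjust them to your own install.
REGISTER /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar;
REGISTER /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar;
REGISTER /me/pig/contrib/piggybank/java/piggybank.jar;

-- Placeholder input path taken from the original question.
dataImport = LOAD '/our/path/to/our/data/data.avro'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage();

-- If the loader and its dependencies resolve, this is expected to show the
-- schema read from the Avro file instead of "Schema for dataImport unknown."
DESCRIBE dataImport;
```

Registering the Avro and json-simple jars before piggybank.jar mirrors the order used in the quoted reply; whether the order matters is not established in this thread.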

On Jun 18, 2012, at 12:58 AM, Markus Resch <[EMAIL PROTECTED]> wrote:

> Hey Russell,
>
> thanks a million for that hint. I tried registering all those jars as
> well, but it didn't work. From my point of view, the most annoying
> thing here is that the data storage doesn't complain about the missing
> jars but appears to work, so the following parts of the script fail
> with misleading messages. I think a warning would be handy here.
>
> Thanks for your other hint as well. I think we will test Pig 0.10 and
> switch over on the production cluster after it's bundled with a stable
> Cloudera release. We will see.
>
> Best
>
> Markus
>
> On Friday, 15.06.2012, at 12:11 -0700, Russell Jurney wrote:
>> Oh, maybe you also need to load other jars? I load Avro this way:
>>
>> REGISTER /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar
>>
>> REGISTER /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar
>>
>> REGISTER /me/pig/contrib/piggybank/java/piggybank.jar
>>
>>
>> Russell Jurney http://datasyndrome.com
>>
>> On Jun 15, 2012, at 1:50 AM, Markus Resch <[EMAIL PROTECTED]> wrote:
>>
>> Hey all,
>>
>> we're currently testing the switch from CDH3 to CDH4.
>> When I try to read my Avro input data, I get a "Schema unknown" error:
>>
>> bash-3.2$ pig
>> 12/06/15 08:48:08 WARN pig.Main: Cannot write to log
>> file: /usr/lib/pig/pig_1339750088923.log
>> 2012-06-15 08:48:09,415 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> Connecting to hadoop file system at: file:///
>> 2012-06-15 08:48:09,421 [main] WARN
>> org.apache.hadoop.conf.Configuration - fs.default.name is deprecated.
>> Instead, use fs.defaultFS
>> 2012-06-15 08:48:09,747 [main] WARN
>> org.apache.hadoop.conf.Configuration - fs.default.name is deprecated.
>> Instead, use fs.defaultFS
>> grunt> REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;
>> grunt> dataImport = LOAD '/our/path/to/our/data/data.avro' USING
>> org.apache.pig.piggybank.storage.avro.AvroStorage ();
>> grunt> describe dataImport;
>>
>> Schema for dataImport unknown.
>>
>> Is this a known issue, or am I doing something wrong?
>>
>> Thanks
>> Markus
>