Pig >> mail # user >> AvroStorage can't read multiple files?


Re: AvroStorage can't read multiple files?
I checked: there are no extra files, and all of them are named "****.avro".

So the cause must be at least partly the file names, because all of them
had a "#" character in the name. I ran this test:

-- a = LOAD 'abb#b.avro' USING
org.apache.pig.piggybank.storage.avro.AvroStorage();
a = LOAD 'aaa.avro' USING
org.apache.pig.piggybank.storage.avro.AvroStorage();

Of the two LOAD statements above, the first doesn't work, even though the
two files have the same content. LOAD 'abb*' doesn't work either.
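
One possible explanation, which is only an assumption and is not confirmed anywhere in this thread: if the load location is parsed as a URI at some point, then "#" starts a URI fragment and everything after it is dropped from the path. The sketch below uses only the standard java.net.URI class to show that effect on the two test names. The class name HashInFileName and the helpers pathSeenByUri and withoutHash are hypothetical, made up for illustration; this is not AvroStorage code.

```java
import java.net.URI;

// Illustration only: how java.net.URI splits a file name that contains '#'.
final class HashInFileName {

    // Returns the path component java.net.URI reports for a raw file name.
    static String pathSeenByUri(String rawName) {
        return URI.create(rawName).getPath();
    }

    // Hypothetical workaround helper: replace '#' with '_' before loading.
    static String withoutHash(String rawName) {
        return rawName.replace('#', '_');
    }

    public static void main(String[] args) {
        URI u = URI.create("abb#b.avro");
        // '#' starts the fragment, so the path stops at "abb".
        System.out.println("path=" + u.getPath() + " fragment=" + u.getFragment());
        // A name without '#' is kept whole.
        System.out.println(pathSeenByUri("aaa.avro"));
        System.out.println(pathSeenByUri(withoutHash("abb#b.avro")));
    }
}
```

If that is what happens here, renaming the files so they contain no "#" (as withoutHash does for the name string) would be the simplest workaround, which matches the test above where 'aaa.avro' loads fine.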

On Tue, Oct 1, 2013 at 8:05 AM, Serega Sheypak <[EMAIL PROTECTED]> wrote:

> Looks like you have corrupted Avro files or non-Avro files inside the
> directory, or your files have different schemas.
>
> Try reading about AvroStorage and running it with the debug option (see the
> doc https://cwiki.apache.org/confluence/display/PIG/AvroStorage). It should
> help you get an idea of where things go wrong.
>
> Global Parameters
>
>    - debug n
>    Users can check debug information using this option where *n* represents
>    the debug level:
>    1. Show names of some function calls when *n>=3*;
>    2. Show details when *n>=5*.
>
> 2013/10/1 Yang <[EMAIL PROTECTED]>
>
> > I had this code:
> >
> > a = load 'myfile.avro' USING AvroStorage();
> > ....
> >
> > it works fine.
> >
> > but after I changed 'myfile' to 'mydirectory/path/' or
> > 'mydirectory/path/*', it loaded nothing and gave an error:
> > "Schema for a unknown."
> >
> > I guess AvroStorage() does not provide the ability to load multiple
> > files? I remember the default PigStorage does allow file globs to be
> > loaded.
> >
> > thanks!
> > yang
> >
>