Pig >> mail # user >> AvroStorage can't read multiple files?


Yang 2013-10-01, 14:43
Serega Sheypak 2013-10-01, 15:05
Re: AvroStorage can't read multiple files?
I checked: there are no extra files, and all of them are named "****.avro".

The cause must be at least partly the file names, because all of them had a
"#" character in the name. I ran this test:

-- a = LOAD 'abb#b.avro' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
a = LOAD 'aaa.avro' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

Of the two statements above, the first doesn't work, even though the two
files have the same content. If I try LOAD "abb*", it doesn't work either.
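
As a possible workaround for the observation above, the files could be renamed so that no name contains "#" before they are loaded. The following is only a minimal sketch: it assumes the files sit on a local filesystem (for HDFS the same idea would need the Hadoop file tools instead), and the function name `rename_hash_files` and the "_" replacement are hypothetical choices, not anything from Pig or AvroStorage.

```python
from pathlib import Path


def rename_hash_files(directory, replacement="_"):
    """Rename *.avro files whose names contain '#'.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    Assumption: `directory` is a local path, not an HDFS path.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed
    for path in sorted(Path(directory).glob("*.avro")):
        if "#" not in path.name:
            continue
        target = path.with_name(path.name.replace("#", replacement))
        if target.exists():
            # Never silently overwrite an existing file
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed.append((path.name, target.name))
    return renamed
```

With the two files from the test above, this would rename 'abb#b.avro' to 'abb_b.avro' and leave 'aaa.avro' alone. Whether the renamed file then loads is something to verify, since the thread does not confirm why "#" causes the failure.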

On Tue, Oct 1, 2013 at 8:05 AM, Serega Sheypak <[EMAIL PROTECTED]> wrote:

> Looks like you have corrupted Avro files or non-Avro files inside the
> directory, or your files have different schemas.
>
> Try reading about AvroStorage and running it with the debug option (see doc
> https://cwiki.apache.org/confluence/display/PIG/AvroStorage). It should
> help you get an idea of where things go wrong.
>
> Global Parameters
>
>    - debug n
>    Users can check debug information using this option where *n* represents
>    the debug level:
>    1. Show names of some function calls when *n>=3*;
>    2. Show details when *n>=5*.
>
> 2013/10/1 Yang <[EMAIL PROTECTED]>
>
> > I had this code:
> >
> > a = load 'myfile.avro' USING AvroStorage();
> > ....
> >
> > it works fine.
> >
> > but after I change 'myfile' to 'mydirectory/path/' or
> > 'mydirectory/path/*', it loads nothing and gives an error:
> > "Schema for a unknown."
> >
> > I guess AvroStorage() does not provide the ability to load multiple
> > files? I remember the default PigStorage does allow file globs to be
> > loaded.
> >
> > thanks!
> > yang
> >
>
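
The suggestion quoted above, that the directory might contain files that are not Avro files, can be checked cheaply before running Pig. Avro object container files begin with the four bytes 'O', 'b', 'j', 1, so a file without that header is not a valid container file. The sketch below assumes the files are on a local filesystem; the function name `non_avro_files` is hypothetical. It only inspects the header, so it will not catch a file that is corrupted further in, nor files whose schemas differ.

```python
from pathlib import Path

# First four bytes of an Avro object container file
AVRO_MAGIC = b"Obj\x01"


def non_avro_files(directory):
    """Return names of regular files in `directory` lacking the Avro magic header.

    Assumption: `directory` is a local path, not an HDFS path.
    """
    suspects = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        with path.open("rb") as handle:
            if handle.read(4) != AVRO_MAGIC:
                suspects.append(path.name)
    return suspects
```

An empty result means every file at least starts like an Avro container file, which would point away from the "not Avro files" explanation and toward something else, such as the file names.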
Yang 2013-10-01, 17:54