Pig >> mail # user >> Reading multiple files of a directory using a Single LOAD Command in PIG


Mix Nin 2013-06-11, 21:26
Alan Crosswell 2013-06-11, 21:27
Prashant Kommireddi 2013-06-11, 21:32
Re: Reading multiple files of a directory using a Single LOAD Command in PIG
Hi,

My mistake: I used backslashes, which caused the error. With forward
slashes it works fine.

Good to know that LOAD ignores filenames that begin with "_" or a period
".". In that case, can I simply use LOAD '/Output/*' instead of
LOAD '/Output/part-m*'?

Thanks
On Tue, Jun 11, 2013 at 2:32 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:

> What is the error?
>
> The LoadFunc should be ignoring any filenames that begin with "_" or a
> period "."
> If you are trying to skip the _SUCCESS file, the loader you are using
> (PigStorage) already handles that.
>
> Also, can you double-check that your path uses forward slashes, i.e.
> "/Output/part-m*", rather than backslashes?
>
>
> On Tue, Jun 11, 2013 at 2:26 PM, Mix Nin <[EMAIL PROTECTED]> wrote:
>
> > I have a directory "Output2". It contains the files below:
> >
> > -----------------
> > _SUCCESS
> > part-m-00000
> > part-m-00001
> > part-m-00002
> > part-m-00003
> > .
> > .
> > .
> > .
> > part-m-00100
> > -----------------
> >
> > The above files were produced by a Pig STORE command.
> >
> > I want to read the files starting with "part-m-" using a Pig command.
> >
> > When I tried Data = LOAD '\Output2\part-m-*' AS ( );
> > it did not work and threw an error.
> >
> > How do I read these files in a single LOAD statement?
> >
> > Thanks
> >
> >
>
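
For reference, the two forms discussed in this thread can be sketched in Pig Latin roughly as follows. This is only an illustration under stated assumptions: the directory '/Output2' is taken from the original question, the alias names are made up, no schema is given because the original post left it empty, and the default PigStorage loader (tab-delimited) is assumed.

```pig
-- Paths use forward slashes; '/Output2' is the directory named in the question.

-- Explicit glob: load only the part-m-* files in a single LOAD statement.
parts = LOAD '/Output2/part-m-*' USING PigStorage();

-- Whole-directory glob: per the reply quoted above, names beginning with
-- "_" or "." (such as _SUCCESS) should be skipped by the loader.
everything = LOAD '/Output2/*' USING PigStorage();

-- Print the loaded records to check the result (alias name is illustrative).
DUMP parts;
```

If both forms behave as described in the thread, the two aliases should contain the same records for this directory, since _SUCCESS is the only file that the narrower glob excludes.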
Harsh J 2013-06-12, 03:15