Pig >> mail # user >> Load multiple files with date variables in pig

Re: Load multiple files with date variables in pig
In that case you'll need to write some code external to your script that
generates a glob pattern covering every date in the range and passes it into
your Pig script as a parameter. So instead of START and END you get something
like this:

DATE_PATTERN={2011/12/{29,30,31},2012/01/{01,02,03,04}}

Yes, it's clunky but it's how HDFS handles path globbing.
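
For what it's worth, here is a minimal sketch of such an external generator.
It assumes Python is available and that the directories follow a yyyy/MM/dd
layout as in the example above. The function name date_glob is made up for
illustration, not part of Pig or Hadoop.

```python
from collections import OrderedDict
from datetime import date, timedelta


def _alternatives(items):
    """Wrap several items in {a,b,...}; return a single item unwrapped."""
    if len(items) == 1:
        return items[0]
    return "{%s}" % ",".join(items)


def date_glob(start, end):
    """Build a glob matching every yyyy/MM/dd directory from start to end, inclusive.

    Hypothetical helper: the yyyy/MM/dd layout is an assumption taken
    from the paths in this thread.
    """
    if end < start:
        raise ValueError("end date precedes start date")

    # Collect the day-of-month strings under each "yyyy/MM" prefix,
    # keeping the months in chronological order.
    months = OrderedDict()
    day = start
    while day <= end:
        months.setdefault(day.strftime("%Y/%m"), []).append(day.strftime("%d"))
        day += timedelta(days=1)

    parts = ["%s/%s" % (prefix, _alternatives(days))
             for prefix, days in months.items()]
    return _alternatives(parts)


if __name__ == "__main__":
    # Dates from the question: 2011/12/29 through 2012/01/04.
    print(date_glob(date(2011, 12, 29), date(2012, 1, 4)))
    # prints {2011/12/{29,30,31},2012/01/{01,02,03,04}}
```

The printed string can then be handed to Pig through parameter substitution,
for example with -param DATE_PATTERN=... on the pig command line, and used in
the script as load '/table/status/$DATE_PATTERN'. Quote the value in the
shell so the braces reach Pig unchanged.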

On Thu, Sep 27, 2012 at 3:49 PM, Jerry Jiang <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am new to pig.
>
> In Pig, I want to load multiple files with date variables in their names.
>
> If I load files between 2012/02/12 and 2012/02/19, the following works:
>
> $START = "12"
> $END = "19"
> raw_data = load '/table/status/2012/02/{$START,$END}' using Loader()
>
> Suppose the start date is 2011/12/29 and end date is 2012/01/04, how do I
> change the line of code?
>
> Thanks for any help!
>
> Jerry
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*