Pig >> mail # user >> Way of determining the source of data


Re: Way of determining the source of data
Check https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AIloaddatafromadirectorywhichcontainsdifferentfile.HowdoIfindoutwherethedatacomesfrom%3F
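
For plain delimited files, a rough sketch of one way to get the source file into the relation follows. It assumes Pig 0.12 or later, where the built-in PigStorage loader accepts a '-tagPath' option that prepends the input path as the first field ('-tagFile' prepends only the file name). It does not use the custom LogLoader from the question, so the log line arrives unparsed, and the aliases raw, tagged, source_path, and line are made up for illustration.

```pig
-- Assumption: Pig 0.12+, built-in PigStorage with the '-tagPath' option.
-- This stands in for the custom LogLoader, which is not shown in the thread.
raw = LOAD 's3://bucket/directory/*' USING PigStorage('\t', '-tagPath');

-- $0 is the path of the file the record came from; $1 is the first
-- tab-delimited field (the whole line if the logs contain no tabs).
tagged = FOREACH raw GENERATE (chararray)$0 AS source_path,
                              (chararray)$1 AS line;
```

With a custom loader such as LogLoader, the path has to be appended inside the loader itself instead.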

On Thu, Feb 2, 2012 at 5:11 PM, Ranjan Bagchi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a bunch of (for example) Apache log files that I'm searching through. I can process them with:
>
> logs = load 's3://bucket/directory/*' USING LogLoader as (remoteAddr, remoteLogname, user, time :chararray, method, uri :chararray, proto, status, bytes, referer, userAgent);
>
> Is there any way to add the name of the file from which each record in logs was loaded to the relation?
>
> Thanks,
>
> Ranjan