Pig, mail # user - How to access to the tuple items of REGEX_EXTRACT_ALL ?


brice lecomte 2013-02-27, 16:49
Re: How to access to the tuple items of REGEX_EXTRACT_ALL ?
Johnny Zhang 2013-02-27, 19:26
Hi, Brice:
Instead of saving and reloading it, can you try 'dump c;' first and then use c.$0?

Johnny
On Wed, Feb 27, 2013 at 8:49 AM, brice lecomte <[EMAIL PROTECTED]> wrote:

> Hello,
> --Pig 0.10.0--
> I'd like to directly access the result of:
> grunt> c = foreach logs  generate REGEX_EXTRACT_ALL(f1, '([a-zA-Z]{3,3})
> ([0-9]{1,2}) ([0-2]{1}[0-9]{1}:[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1})
> ([a-zA-Z0-9-_]+) ([a-zA-Z]+)\\[[0-9]+\\]: (.*)');
> grunt> illustrate c;
>
>
> -------------------------------------------------------------------------------------------------------------
> | logs     | f1:chararray                                                                                   |
> -------------------------------------------------------------------------------------------------------------
> |          | Feb 24 20:09:01 hadoop-master CRON[3574]: pam_unix(cron:session): session closed for user root |
> -------------------------------------------------------------------------------------------------------------
>
> ----------------------------------------------------------------------------
> | c     | org.apache.pig.builtin.regex_extract_all_f1_178:tuple()          |
> ----------------------------------------------------------------------------
> |       | (Feb, ..., pam_unix(cron:session): session closed for user root) |
> ----------------------------------------------------------------------------
>
> but the only way I found is to store and reload it:
>
> grunt> store c into 'pig/AUTH.result';
> grunt> auth = LOAD 'pig/AUTH.result/part-m-00000' USING PigStorage(',')
> AS (m:chararray, d:int, time:chararray, hostname:chararray,
> service:chararray, info:chararray);
> grunt> day_frequency = GROUP auth by (d,service);
> ...
>
> Is there a way to name the tuple items, or to access them with something
> like c.$0 or FLATTEN(c).$0?
>
> Thanks,
> Brice
>
>
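One way to name the items without the store/reload round trip is to FLATTEN the tuple returned by REGEX_EXTRACT_ALL and give it a schema with AS. The following is only a sketch and has not been checked against Pig 0.10.0 specifically. The input path 'pig/AUTH.log', the use of TextLoader, and the relation name "parsed" are assumptions made for illustration; the regex and field names are taken from the message above. All fields are declared as chararray and the day is cast in a second step, on the assumption that this is the safer route when the tuple has no declared schema.

```pig
-- Hypothetical input path; TextLoader reads each line as a single chararray field.
logs = LOAD 'pig/AUTH.log' USING TextLoader() AS (f1:chararray);

-- FLATTEN un-nests the tuple produced by REGEX_EXTRACT_ALL, and AS names its six items.
parsed = FOREACH logs GENERATE
    FLATTEN(REGEX_EXTRACT_ALL(f1, '([a-zA-Z]{3,3}) ([0-9]{1,2}) ([0-2]{1}[0-9]{1}:[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}) ([a-zA-Z0-9-_]+) ([a-zA-Z]+)\\[[0-9]+\\]: (.*)'))
    AS (m:chararray, d:chararray, time:chararray, hostname:chararray,
        service:chararray, info:chararray);

-- Cast the day to int explicitly so it matches the schema used after the reload.
auth = FOREACH parsed GENERATE m, (int)d AS d, time, hostname, service, info;

-- Same grouping as in the original script, now without the intermediate store.
day_frequency = GROUP auth BY (d, service);
```

REGEX_EXTRACT_ALL returns null for lines that do not match the pattern, so adding a FILTER on "m IS NOT NULL" before the GROUP may be worthwhile.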
brice lecomte 2013-02-28, 10:27
brice lecomte 2013-02-28, 14:48