Pig >> mail # user >> How to access to the tuple items of REGEX_EXTRACT_ALL ?


Re: How to access to the tuple items of REGEX_EXTRACT_ALL ?
Hi, Brice:
Instead of saving and reloading it, can you try 'dump c;' first and then use c.$0?

Johnny
On Wed, Feb 27, 2013 at 8:49 AM, brice lecomte <[EMAIL PROTECTED]> wrote:

> Hello,
> --Pig 0.10.0--
> I'd like to directly access the result of:
> grunt> c = foreach logs  generate REGEX_EXTRACT_ALL(f1, '([a-zA-Z]{3,3})
> ([0-9]{1,2}) ([0-2]{1}[0-9]{1}:[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1})
> ([a-zA-Z0-9-_]+) ([a-zA-Z]+)\\[[0-9]+\\]: (.*)');
> grunt> illustrate c;
>
>
> ----------------------------------------------------------------------------------------------------------
> | logs | f1:chararray |
> ----------------------------------------------------------------------------------------------------------
> |      | Feb 24 20:09:01 hadoop-master CRON[3574]: pam_unix(cron:session): session closed for user root |
> ----------------------------------------------------------------------------------------------------------
>
> ----------------------------------------------------------------------------
> | c | org.apache.pig.builtin.regex_extract_all_f1_178:tuple() |
> ----------------------------------------------------------------------------
> |   | (Feb, ..., pam_unix(cron:session): session closed for user root) |
> ----------------------------------------------------------------------------
>
> but the only way I have found is to save and reload it:
>
> grunt> store c into 'pig/AUTH.result';
> grunt> auth = LOAD 'pig/AUTH.result/part-m-00000' USING PigStorage(',')
> AS (m:chararray, d:int, time:chararray, hostname:chararray,
> service:chararray, info:chararray);
> grunt> day_frequency = GROUP auth by (d,service);
> ...
>
> Is there a way to name the tuple items, or to access them with something
> like c.$0 or FLATTEN(c).$0?
>
> Thanks,
> Brice
>
>
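
For reference, one common way to name the tuple items without the store/load round trip is to FLATTEN the tuple returned by REGEX_EXTRACT_ALL and give the fields aliases with AS. The following is only a sketch. It assumes that `logs` is already loaded with a single chararray field `f1` as in the message above, that the wrapped regex joins with single spaces, and that the pattern always yields six groups. The field names are borrowed from the LOAD statement in the original message, and the relation name `auth` is reused purely for illustration.

```pig
-- Sketch only: `logs` and `f1` are taken from the original message.
-- FLATTEN turns the single tuple column into six top-level columns,
-- and AS gives each of them a name.
c = FOREACH logs GENERATE
      FLATTEN(REGEX_EXTRACT_ALL(f1, '([a-zA-Z]{3,3}) ([0-9]{1,2}) ([0-2]{1}[0-9]{1}:[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}) ([a-zA-Z0-9-_]+) ([a-zA-Z]+)\\[[0-9]+\\]: (.*)'))
      AS (m:chararray, d:chararray, time:chararray,
          hostname:chararray, service:chararray, info:chararray);

-- The regex groups come back as chararrays, so the day is cast to int
-- in a separate step before grouping.
auth = FOREACH c GENERATE m, (int)d AS d, time, hostname, service, info;

day_frequency = GROUP auth BY (d, service);
```

With the aliases in place, the fields can be referenced by name (for example `auth.service`) or by position (`$0`, `$1`, ...) in later statements, so the intermediate `pig/AUTH.result` file should no longer be needed.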