Pig >> mail # user >> Pig Regex Help


Re: Pig Regex Help
Harsha, thanks for your response. I needed to add USING PigStorage(',') to my LOAD statement. It works now.
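
For reference, here is a rough sketch of how the script might look with that change applied to the statements quoted further down in this thread. The thread does not say why the change helped. One possible explanation, and it is only an assumption, is that the input lines contained tab characters: the default PigStorage delimiter is a tab, so only the text before the first tab would end up in `line` and the pattern would not match. The path 'data' is the placeholder used in the thread, not a real file name.

```pig
-- Sketch only: 'data' is the placeholder path from the thread.
-- With ',' as the delimiter, a line that contains no commas is
-- loaded whole into the single field `line`.
A = LOAD 'data' USING PigStorage(',') AS (line:chararray);

-- Same pattern as in the thread: REGEX_EXTRACT_ALL returns a tuple
-- of the six capture groups, which FLATTEN turns into six fields.
B = FOREACH A GENERATE FLATTEN(
        REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$'))
    AS (Field1:chararray, Field2:chararray, Date:chararray,
        Field3:chararray, Field4:chararray, Field5:chararray);

DUMP B;
```

If a line does not match the whole pattern, REGEX_EXTRACT_ALL returns null for that line, so a row of empty fields in the output is a hint that `line` does not contain what the pattern expects.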
 

 

 

-----Original Message-----
From: Harsha <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Sat, Mar 9, 2013 10:40 pm
Subject: Re: Pig Regex Help
Hi John,
     I ran these in Pig 0.9.2:
     A = LOAD 'data' as line:chararray;
     B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$'))
AS (Field1:CHARARRAY,Field2:CHARARRAY,Date:CHARARRAY,Field3:CHARARRAY,Field4:CHARARRAY,Field5:CHARARRAY);
 dump B;
gives me the following:
(a02s6pq0s1t,dl,20130106,U,X,32)
(johnm,dl,20130106,D,X,32)
Which version of Pig are you running?
--
Harsha
On Saturday, March 9, 2013 at 6:57 PM, John Meek wrote:

> B = FOREACH A GENERATE FLATTEN(
> REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$')) AS
> (Field1:CHARARRAY,Field2:CHARARRAY,Date:CHARARRAY,Field3:CHARARRAY,Field4:CHARARRAY,Field5:CHARARRAY);
>
 