Pig >> mail # user >> Pig Regex Help


John Meek 2013-03-10, 02:57
Re: Pig Regex Help
Hi John,
     I ran these statements in Pig 0.9.2:
     A = LOAD 'data' as line:chararray;
     B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$')) AS (Field1:CHARARRAY,Field2:CHARARRAY,Date:CHARARRAY,Field3:CHARARRAY,Field4:CHARARRAY,Field5:CHARARRAY);
     dump B;
This gives me the following output:
(a02s6pq0s1t,dl,20130106,U,X,32)
(johnm,dl,20130106,D,X,32)
Which version of Pig are you running?
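If it helps to rule out the pattern itself, here is a minimal sketch that checks the same regex in plain Java, outside Pig. As far as I know, REGEX_EXTRACT_ALL is built on java.util.regex and only returns groups when the whole line matches, so this should behave the same way. The class name RegexCheck, the method extract, and the two sample input lines are assumptions of mine: the inputs are reconstructed from the output above, and your real data may use tabs or other whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCheck {
    // Same pattern text as in the Pig statement; a Java string literal
    // also needs "\\" to produce a single backslash in the regex.
    static final Pattern LINE =
        Pattern.compile("^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$");

    // Returns the six captured groups, or null when the whole line
    // does not match the pattern.
    static List<String> extract(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        List<String> fields = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            fields.add(m.group(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical input lines, reconstructed from the dump output.
        String[] samples = {
            "a02s6pq0s1t-dl 20130106-UX 32",
            "johnm-dl 20130106-DX 32"
        };
        for (String s : samples) {
            System.out.println(extract(s));
        }
    }
}
```

If a line from your actual file comes back as null here, the data does not fit the pattern (for example a missing hyphen or an extra field), and the regex rather than the Pig version would be the place to look.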
--
Harsha
On Saturday, March 9, 2013 at 6:57 PM, John Meek wrote:

> B = FOREACH A GENERATE FLATTEN(
> REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$')) AS (Field1:CHARARRAY,Field2:CHARARRAY,Date:CHARARRAY,Field3:CHARARRAY,Field4:CHARARRAY,Field5:CHARARRAY);
>

John Meek 2013-03-10, 14:38
John Meek 2013-03-10, 04:03