Pig >> mail # user >> Pig Regex Help


Pig Regex Help
Hi,

I'm trying to use the following statement in Pig to parse my data.

B = FOREACH A GENERATE FLATTEN(
    REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$')
) AS (Field1:CHARARRAY, Field2:CHARARRAY, Date:CHARARRAY, Field3:CHARARRAY, Field4:CHARARRAY, Field5:CHARARRAY);

The input is basically a file with values in the following format:
a02s6pq0s1t-dl  20130106-UX    32
johnm-dl  20130106-DX    32

I need the output to be 6 columns like below:

a02s6pq0s1t dl  20130106 U X 32
johnm dl  20130106 D X 32

Pig is returning () instead of the parsed fields. Please help.
John M
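
One way to narrow this down outside Pig is to run the pattern against the sample lines directly. The sketch below is only a stand-in: it uses Python's re.fullmatch, on the assumption that it behaves like REGEX_EXTRACT_ALL here (the whole string has to match, otherwise nothing is returned). It also assumes the sample lines contain exactly the whitespace shown in the post. The names ORIGINAL, TOLERANT, SAMPLES and extract_all are made up for illustration, and the \s+ variant is a suggestion, not something from the original statement.

```python
import re

# Pattern from the post with Pig's string escaping removed
# ('\\-' and '\\s' in the Pig literal are '\-' and '\s' here).
ORIGINAL = r'^(.+?)\-(.+?)\s(.+?)\-(.)(.)\s(.+)$'

# Hypothetical variant: \s+ absorbs a run of whitespace between fields.
TOLERANT = r'^(.+?)\-(.+?)\s+(.+?)\-(.)(.)\s+(.+)$'

# Sample lines copied from the post (whitespace assumed to be as displayed).
SAMPLES = [
    "a02s6pq0s1t-dl  20130106-UX    32",
    "johnm-dl  20130106-DX    32",
]


def extract_all(pattern, line):
    """Return every capture group if the whole line matches, else None."""
    m = re.fullmatch(pattern, line)
    return m.groups() if m else None


if __name__ == "__main__":
    for line in SAMPLES:
        # repr() makes any stray whitespace inside a group visible.
        print("original:", repr(extract_all(ORIGINAL, line)))
        print("tolerant:", repr(extract_all(TOLERANT, line)))

    # Only the first whitespace-separated column: the pattern cannot match.
    print("first column only:", repr(extract_all(ORIGINAL, "a02s6pq0s1t-dl")))
```

In this sketch the original pattern does match the full sample lines, but the third and sixth groups keep leading whitespace because each \s consumes a single character. The \s+ variant returns six clean fields; in the Pig string literal that would be written '\\s+'. The last call shows that the pattern finds nothing when it only sees the first column. That may be relevant if A was loaded with the default tab delimiter, so that line holds only part of each row, but that is a guess since the LOAD statement is not shown.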