Pig, mail # user - Pig Regex Help


Pig Regex Help
John Meek 2013-03-10, 02:57
Hi,

I'm trying to use the following statement in Pig to parse my data.

B = FOREACH A GENERATE FLATTEN(
    REGEX_EXTRACT_ALL(line, '^(.+?)\\-(.+?)\\s(.+?)\\-(.)(.)\\s(.+)$'))
    AS (Field1:CHARARRAY, Field2:CHARARRAY, Date:CHARARRAY,
        Field3:CHARARRAY, Field4:CHARARRAY, Field5:CHARARRAY);

The input is basically a file with values in the following format:
a02s6pq0s1t-dl  20130106-UX    32
johnm-dl  20130106-DX    32

I need the output to be split into 6 columns, like this:

a02s6pq0s1t dl  20130106 U X 32
johnm dl  20130106 D X 32

Instead, Pig is giving me an empty tuple, (). Please help.
John M
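
One way to narrow this down is to test the pattern outside Pig. The sketch below uses Python rather than Pig, purely as a stand-in: Pig's regex functions are backed by Java regular expressions, and for this particular pattern Python's `re` should behave the same way. The helper `extract_all`, the names `ORIGINAL`, `TOLERANT` and `SAMPLE`, and the assumption that the whole string must match are illustrative and not taken from the post. It suggests two things: the posted pattern does match a full input line when the separators are spaces (though some fields pick up stray whitespace), and it fails if `line` holds only the first tab-separated field, which could happen if the file was loaded with a tab-delimited loader. That second point is a guess about the cause, not a confirmed diagnosis.

```python
import re

# Pattern from the post, with Pig's doubled backslashes reduced to single ones.
ORIGINAL = r'^(.+?)\-(.+?)\s(.+?)\-(.)(.)\s(.+)$'

# Hypothetical variant: \s+ absorbs runs of spaces or tabs between fields.
TOLERANT = r'^(.+?)-(.+?)\s+(.+?)-(.)(.)\s+(.+)$'

# Sample line modeled on the post: two spaces, then four spaces, as separators.
SAMPLE = "a02s6pq0s1t-dl  20130106-UX    32"


def extract_all(pattern, line):
    """Return all capture groups if the WHOLE line matches, else None.

    Assumption: this whole-string behaviour mirrors what REGEX_EXTRACT_ALL
    does in Pig, where a failed match yields null and FLATTEN shows ().
    """
    match = re.fullmatch(pattern, line)
    return match.groups() if match else None


if __name__ == "__main__":
    # Full line, original pattern: matches, but fields carry extra spaces.
    print(extract_all(ORIGINAL, SAMPLE))
    # Full line, tolerant pattern: six clean fields.
    print(extract_all(TOLERANT, SAMPLE))
    # Only the first whitespace-separated field: no match at all.
    print(extract_all(ORIGINAL, SAMPLE.split()[0]))
```

If the last case is what is happening, it may be worth checking how A is loaded; Pig's TextLoader reads each line as a single chararray. In the Pig statement itself the tolerant pattern would keep the doubled backslashes, for example `\\s+`.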