Re: Regex expression in FOREACH
praveenesh kumar 2012-02-10, 19:30
No, this is not what I was asking for.
Suppose I have column names like:
I want to generate all the columns whose names start with Update.
If I have a small number of columns, I can do this by eyeballing them. But if I
have around 100 columns, it is kind of difficult.
In Hive we can do this, as we can in SQL. I want to know whether it is also
possible in Pig to generate columns using some kind of regex.
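
To make the request concrete, here is a rough sketch of the selection I have in mind, done outside Pig: a small Python script picks the column names that match a regex and builds the FOREACH statement as text. This is only one possible workaround, not a Pig feature. The column list, the relation name A, the alias B, and both helper functions are hypothetical and exist only for illustration.

```python
import re


def select_columns(columns, pattern):
    """Return the column names that match the regex from their first character."""
    regex = re.compile(pattern)
    return [name for name in columns if regex.match(name)]


def build_foreach(relation, columns, pattern, alias="B"):
    """Build a Pig FOREACH ... GENERATE statement for the matching columns."""
    selected = select_columns(columns, pattern)
    if not selected:
        raise ValueError("no column matches %r" % pattern)
    return "%s = FOREACH %s GENERATE %s;" % (alias, relation, ", ".join(selected))


# Hypothetical column names; in practice they would come from the load schema.
columns = ["id", "UpdateDate", "name", "UpdateUser", "UpdateReason"]

statement = build_foreach("A", columns, r"Update")
print(statement)
```

The generated line could then be pasted into the script or substituted into it before the script is submitted.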
On Fri, Feb 10, 2012 at 11:38 PM, Grig Gheorghiu
> You can use EXTRACT.
> REGISTER file:/home/hadoop/lib/pig/piggybank.jar;
> DEFINE EXTRACT org.apache.pig.piggybank.evaluation.string.EXTRACT();
> Assume relation A contains tuples with a field called key of the form:
> Then you can extract the id field like this:
> B = FOREACH A GENERATE
>     EXTRACT(key, 'id=([^\\|]+)[\\|]*')
>     AS (
>         id: chararray
>     );
> Note that each backslash needs to be escaped, hence the \\.
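
The effect of the escaping can be checked outside Pig. Once the doubled backslashes in the Pig string literal are unescaped, the regex engine sees `id=([^\|]+)[\|]*`. The sketch below applies that pattern with Python's `re.search`. The sample key is hypothetical, since the thread does not show the actual key format, and whether EXTRACT searches within the string or requires a full match is an assumption here.

```python
import re

# The pattern as the regex engine sees it after Pig unescapes the '\\' pairs.
PATTERN = re.compile(r"id=([^\|]+)[\|]*")


def extract_id(key):
    """Return the text captured after 'id=' up to the next '|', or None."""
    match = PATTERN.search(key)
    return match.group(1) if match else None


# Hypothetical pipe-delimited key, made up for this illustration.
sample_key = "ts=20120210|id=abc123|type=click"
print(extract_id(sample_key))  # prints: abc123
```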
> On Fri, Feb 10, 2012 at 3:22 AM, praveenesh kumar <[EMAIL PROTECTED]>
> > Is it possible to specify a regex in a FOREACH statement to
> > generate only the columns selected by that regex?
> > Suppose I want to generate only those columns that end with 'XYZ'. Is it
> > possible to do this in Pig using some regex?
> > Thanks,
> > Praveenesh