Eli Finkelshteyn 2012-02-16, 17:50
Hi, I'm trying to do a pretty simple regex test in PIG right now and getting a weird error. All I'm doing is:
orig_set = load '/data/dictionaries/Eng-Spa.dic' USING PigStorage('\t') AS (orig: CHARARRAY, trans: CHARARRAY);
filtered = FILTER orig_set BY REGEX_EXTRACT(orig, '^[\\#\\<]') == 1;
The error I get is:
2012-02-16 12:45:24,000 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.REGEX_EXTRACT as multiple or none of them fit. Please use an explicit cast
Ideas?
Cheers, Eli
Grig Gheorghiu 2012-02-16, 17:54
Can you try with RegexMatch? I am doing something similar in one of my scripts and it works fine.
Grig
Eli Finkelshteyn 2012-02-16, 18:08
Cool, actually, I just got what I wanted to work like this:
filtered = FILTER orig_set BY orig MATCHES '^[\\#\\<].*';
I didn't know MATCHES worked for regex before. Sweet!
Eli
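For anyone hitting the same ERROR 1045: as far as I can tell, the built-in REGEX_EXTRACT takes three arguments (the string, the regex, and the index of the capture group to return) and returns a chararray, so the two-argument call has no matching signature, and comparing the result to 1 would not work either. MATCHES, by contrast, has to match the entire string, which is why the trailing .* is needed. A minimal sketch of both forms, assuming the same load statement as in the original post; the alias names filtered_extract and filtered_matches are made up for illustration:

```pig
orig_set = LOAD '/data/dictionaries/Eng-Spa.dic' USING PigStorage('\t')
           AS (orig: CHARARRAY, trans: CHARARRAY);

-- REGEX_EXTRACT(string, regex, index) returns capture group `index` as a
-- chararray, or null when the regex does not match, so test for null
-- instead of comparing to 1. The trailing .* keeps the pattern valid
-- whether the Pig version in use searches for a match or requires the
-- whole string to match (an assumption worth checking for your version).
filtered_extract = FILTER orig_set BY REGEX_EXTRACT(orig, '^([#<]).*', 1) IS NOT NULL;

-- MATCHES must cover the whole string, hence the trailing .*
-- Inside a character class, # and < need no escaping.
filtered_matches = FILTER orig_set BY orig MATCHES '^[#<].*';
```

For a plain yes/no filter, the MATCHES form is probably the simpler choice; REGEX_EXTRACT is more useful when the captured text itself is needed.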