Pig >> mail # user >> Re: Using matches in generate clause?


Alan Gates 2012-09-27, 16:38
James Kebinger 2012-09-28, 21:52
pablomar 2012-09-27, 17:34
Re: Using matches in generate clause?
In Pig 0.9, boolean was not yet a first-class data type, so boolean-valued expressions were not allowed in foreach statements.  In Pig 0.10, boolean became a first-class type, so expressions that return booleans (such as matches) should work in a foreach.

Alan.
On Sep 27, 2012, at 10:34 AM, pablomar wrote:

> No idea why, but matches works with FILTER and not with FOREACH.
> I've tried with Pig 0.9.2.
>
> example (this works):
> b = filter html_pages by html matches 'some pattern';
>
>
> If you still want to do it with foreach, you can write your own UDF,
> something like:
>
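> import java.io.IOException;
>
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.Tuple;
>
> // Returns true when the second argument matches the regex in the first.
> public class MyMatch extends EvalFunc<Boolean>
> {
>  @Override
>  public Boolean exec(Tuple input) throws IOException
>  {
>    // Missing arguments yield null rather than an error
>    if (input == null || input.size() < 2)
>      return null;
>    try
>    {
>      String pattern = (String)input.get(0);
>      String value = (String)input.get(1);
>      if (pattern == null || value == null)
>        return null;
>
>      // String.matches requires the whole value to match the pattern
>      return value.matches(pattern);
>    }
>    catch(Exception e)
>    {
>      // Plain IOException with a cause instead of WrappedIOException
>      throw new IOException("MyMatch failed on input: " + input, e);
>    }
>  }
> }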
>
>
> and use it just like this:
>
> b = foreach html_pages generate portal_id, MyMatch('some pattern', html) as
> wp_match;
>
>
>
>
> On Thu, Sep 27, 2012 at 12:38 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> What version of Pig are you using?
>>
>> Alan.
>>
>> On Sep 27, 2012, at 8:54 AM, James Kebinger wrote:
>>
>>> Hello, I'm having some trouble doing something I thought would be easy:
>>> I'd like to use matches to generate a boolean flag but this seems to not
>>> compile:
>>>
>>> FOREACH html_pages GENERATE portal_id, html matches 'some pattern' as
>>> wp_match:boolean;
>>>
>>> I've tried wrapping it in parens too, with no luck.
>>>
>>> Is this possible, or am I out of luck?
>>>
>>> thanks
>>
>>
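
One thing worth checking with either approach: the UDF quoted above calls String.matches, which succeeds only when the entire string matches the pattern, not when the pattern merely occurs somewhere inside it. For a field holding a whole HTML page, that usually means wrapping the pattern in `.*` and, because `.` does not match newlines by default, adding the `(?s)` flag. The sketch below illustrates the difference in plain Java. The class name, helper names, and sample HTML are made up for illustration and are not from the thread; whether Pig's built-in matches operator behaves identically in your version is worth confirming against the Pig documentation.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class MatchSemantics {
    // Whole-string match: the same check the UDF performs via String.matches.
    static boolean fullMatch(String pattern, String value) {
        return value.matches(pattern);
    }

    // Substring match: true if the pattern occurs anywhere in the value.
    static boolean containsMatch(String pattern, String value) {
        return Pattern.compile(pattern).matcher(value).find();
    }

    public static void main(String[] args) {
        // Made-up sample page spanning several lines.
        String html = "<html>\n<body>Powered by WordPress</body>\n</html>";

        // Bare pattern: the whole page is not equal to "WordPress".
        System.out.println(fullMatch("WordPress", html));
        // ".*" on both sides still fails here, because "." stops at newlines.
        System.out.println(fullMatch(".*WordPress.*", html));
        // (?s) lets "." cross newlines, so the whole page matches.
        System.out.println(fullMatch("(?s).*WordPress.*", html));
        // find() looks for the pattern anywhere in the value.
        System.out.println(containsMatch("WordPress", html));
    }
}
```

If a substring test is what you actually want, either pass a pattern such as `(?s).*some pattern.*` to the UDF, or have the UDF use find() as in containsMatch.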
Dmitriy Ryaboy 2012-09-27, 19:31