Pig >> mail # user >> Trouble with REGEX in PIG


Re: Trouble with REGEX in PIG
Are you planning to use

org.apache.pig.builtin.REGEX_EXTRACT
?
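
One thing that may be going on in the quoted session: REGEX_EXTRACT is typed at the grunt prompt as a statement by itself, and the stack trace goes through PigMacro.macroInline, which suggests the parser is treating a bare NAME(...) line as a macro call. Builtin functions are normally used as expressions inside a statement such as FOREACH ... GENERATE. A minimal sketch, assuming a hypothetical one-column input file hosts.txt in the working directory (the file name and the aliases A, B, hostport and host are made up for illustration):

```pig
-- hosts.txt is a hypothetical input file with one host:port string per line,
-- for example a line containing 192.168.1.5:8020
A = LOAD 'hosts.txt' AS (hostport:chararray);

-- REGEX_EXTRACT is used as an expression inside GENERATE, not as a statement.
-- The third argument selects the capture group (1 = the part before the colon).
B = FOREACH A GENERATE REGEX_EXTRACT(hostport, '(.*):(.*)', 1) AS host;

DUMP B;
```

Since org.apache.pig.builtin.REGEX_EXTRACT is a builtin, registering piggybank.jar and the DEFINE should not be needed for this.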

On 12/4/13 9:28 AM, "Watrous, Daniel" <[EMAIL PROTECTED]> wrote:

>Hi,
>
>I'm trying to use regular expressions in PIG, but it's failing. Based on
>the documentation
>http://pig.apache.org/docs/r0.12.0/func.html#regex-extract I am trying
>this:
>
>[watrous@c0003913 ~]$ pig -x local
>which: no hadoop in (/opt/krb5/sbin/64:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/sbin:/usr/sbin:/usr/bin:/opt/pb/bin:/opt/perf/bin:/bin:/usr/local/bin:/home/watrous/bin:/home/watrous/pig-0.12.0/bin)
>2013-12-04 17:15:15,398 [main] INFO  org.apache.pig.Main - Apache Pig
>version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
>2013-12-04 17:15:15,398 [main] INFO  org.apache.pig.Main - Logging error
>messages to: /home/watrous/pig_1386177315394.log
>2013-12-04 17:15:15,425 [main] INFO  org.apache.pig.impl.util.Utils -
>Default bootup file /home/watrous/.pigbootup not found
>2013-12-04 17:15:15,599 [main] INFO
>org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>Connecting to hadoop file system at: file:///
>grunt> REGEX_EXTRACT('192.168.1.5:8020', '(.*):(.*)', 1);
>2013-12-04 17:16:59,753 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>ERROR 1200: <line 1> Cannot expand macro 'REGEX_EXTRACT'. Reason: Macro
>must be defined before expansion.
>Details at logfile: /home/watrous/pig_1386177315394.log
>
>Here's the relevant bit from the log file:
>Pig Stack Trace
>---------------
>ERROR 1200: <line 1> Cannot expand macro 'REGEX_EXTRACT'. Reason: Macro
>must be defined before expansion.
>
>Failed to parse: <line 1> Cannot expand macro 'REGEX_EXTRACT'. Reason:
>Macro must be defined before expansion.
>        at org.apache.pig.parser.PigMacro.macroInline(PigMacro.java:455)
>        at org.apache.pig.parser.QueryParserDriver.inlineMacro(QueryParserDriver.java:298)
>        at org.apache.pig.parser.QueryParserDriver.expandMacro(QueryParserDriver.java:287)
>        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:180)
>        at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1648)
>        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1621)
>        at org.apache.pig.PigServer.registerQuery(PigServer.java:575)
>        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1093)
>        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
>        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>        at org.apache.pig.Main.run(Main.java:541)
>        at org.apache.pig.Main.main(Main.java:156)
>
>I attempted to define the macro (following this tutorial
>http://aws.amazon.com/articles/2729). However, piggybank.jar doesn't
>define org.apache.pig.piggybank.evaluation.string.EXTRACT, so I located
>the most likely file in the current version of the jar.
>
>grunt> register
>/home/watrous/pig-0.12.0/contrib/piggybank/java/piggybank.jar
>grunt> DEFINE REGEX_EXTRACT
>org.apache.pig.piggybank.evaluation.string.RegexExtract;
>grunt> REGEX_EXTRACT('192.168.1.5:8020', '(.*):(.*)', 1);
>2013-12-04 17:23:20,383 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>ERROR 1200: <line 3> Cannot expand macro 'REGEX_EXTRACT'. Reason: Macro
>must be defined before expansion.
>Details at logfile: /home/watrous/pig_1386177315394.log
>
>I get the same stack trace with the only change being a reference to
><line 3> instead of <line 1>.
>
>Any idea how I can get this working?
>
>Daniel