Pig >> mail # user >> Parsing issue within UDF arguments


Parsing issue within UDF arguments
I was trying to use REGEX_EXTRACT_ALL, and it seems that if the argument
contains a semicolon, the script errors out.

______________________________________________________________
Data File contents:

cat data;
foo=bar;sessionIdHash=123123123;alice=bob

Script:

a = load 'data1' using PigStorage() as (aa:chararray);

b = foreach a generate REGEX_EXTRACT_ALL(aa,'.*sessionIdHash=(.*);');

dump b;

_________________________________________________________________

Pig seems to misparse the semicolon inside the UDF's string argument. Here is the stack trace:
Pig Stack Trace
---------------
ERROR 1200: <file test.pig, line 5, column 0>  mismatched character '<EOF>' expecting '''

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. <file test.pig, line 5, column 0>  mismatched character '<EOF>' expecting '''
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1610)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1549)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:534)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:960)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:602)
    at org.apache.pig.Main.main(Main.java:154)
Caused by: Failed to parse: <file test.pig, line 5, column 0>  mismatched character '<EOF>' expecting '''
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:228)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:168)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1602)
    ... 9 more
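
One possible way around this, offered as an untested assumption rather than a confirmed fix, is to keep the literal semicolon out of the script text by writing it as a Unicode escape inside the pattern, e.g. '.*sessionIdHash=(.*)\\u003B' in the Pig script. The Pig side is not verified here. The Java sketch below only shows that java.util.regex treats the escape the same as a literal semicolon for the sample line above. The class and method names (SemicolonEscapeDemo, firstGroup) are hypothetical, and find() is used purely for the demonstration. If REGEX_EXTRACT_ALL requires the whole string to match, the pattern would probably also need a trailing .* to cover the text after the semicolon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: compares a literal semicolon with its escaped form.
class SemicolonEscapeDemo {
    // Returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample line taken from the data file in the post.
        String line = "foo=bar;sessionIdHash=123123123;alice=bob";

        // Pattern from the post, with a literal semicolon.
        String literal = ".*sessionIdHash=(.*);";
        // Same pattern with the semicolon written as a regex-level escape.
        // The doubled backslash means the regex engine, not the Java
        // compiler, resolves the escape to the semicolon character.
        String escaped = ".*sessionIdHash=(.*)\\u003B";

        System.out.println(firstGroup(literal, line)); // 123123123
        System.out.println(firstGroup(escaped, line)); // 123123123
    }
}
```

Both patterns capture the same session hash here, so the escaped form should be a drop-in replacement as far as the regex engine is concerned. Whether the Pig parser accepts it depends on the Pig version in use.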