Pig >> mail # user >> Possible Pig 9.1 globing bug in parameter substitution


Re: Possible Pig 9.1 globing bug in parameter substitution
I've located my problem. I believe it comes from a difference between the
classpaths of 0.9.0 and 0.9.1. It may be somewhat machine-dependent, since
many of these jars are probably found dynamically by the bin/pig script, which
has changed quite a bit since 0.9.0. While debugging, it looked like
GenericOptionsParser was the culprit, so the classpath differences may have
caused a different version of this class to be loaded.

Anyway, the short of it is that I now have to escape the asterisk (*)
character in my globbing pattern.
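To check the classpath theory, the two java.class.path values can be compared entry by entry. The following is only a rough sketch: the function names classpath_only_in and jars_with_class are made up for illustration, entries are compared as plain strings (so bin/../conf and conf count as different), and the jar check assumes unzip is installed.

```shell
# Print the entries that appear on the first classpath but not on the second.
# Entries are compared as literal strings; paths are not normalized.
classpath_only_in() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -v '^$' | sort -u |
  while IFS= read -r entry; do
    case ":$2:" in
      *":$entry:"*) ;;                  # also present on the other classpath
      *) printf '%s\n' "$entry" ;;
    esac
  done
}

# Print, in classpath order, the jars that contain the given class.
# Directories, wildcard entries and missing files are skipped.
jars_with_class() {
  class_file=$(printf '%s' "$2" | tr '.' '/').class
  printf '%s\n' "$1" | tr ':' '\n' |
  while IFS= read -r entry; do
    case $entry in
      *.jar) [ -f "$entry" ] || continue ;;
      *) continue ;;
    esac
    if unzip -l "$entry" 2>/dev/null | grep -qF "$class_file"; then
      printf '%s\n' "$entry"
    fi
  done
}

# Example (paste the two printed classpaths into CP_090 and CP_091):
#   classpath_only_in "$CP_091" "$CP_090"
#   jars_with_class "$CP_091" org.apache.hadoop.util.GenericOptionsParser
```

With the usual JVM class loading order, the first jar listed by jars_with_class is normally the one the class is loaded from, so a different first hit between the two versions would support the theory.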

=== Let's do it with 0.9.0 ===
$ /usr/lib/pig-0.9.0/bin/pig -d INFO \
    -p in_file='/chukwa/repos/Insight-Demo/' \
    -p process_glob='20111226/*/*/*.evt' \
    -p out_file='dashboard-daily-2011-12-26' \
    -p in_file1='dashboard-daily-2011-12-26' \
    -p out_file1='dashboard-daily-2011-12-26' \
    -p current_date_num='20111226' \
    -p timeperiod='1' ap.pig

(System.out.println() calls added for effect)
0.9.0 java.class.path  /etc/hbase:/usr/lib/pig-0.9.0/bin/../conf:/usr/java/default/lib/tools.jar:/usr/lib/pig-0.9.0/bin/../build/classes:/usr/lib/pig-0.9.0/bin/../build/test/classes:/usr/lib/pig-0.9.0/bin/../pig-0.9.0-core.jar:/usr/lib/pig-0.9.0/bin/../build/pig-0.9.1-SNAPSHOT.jar:/usr/lib/pig-0.9.0/bin/../lib/automaton.jar:/etc/hadoop/conf:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.9.jar

Parameter found: in_file=/chukwa/repos/Insight-Demo/
Parameter found: process_glob=20111226/*/*/*.evt
Parameter found: out_file=dashboard-daily-2011-12-26
Parameter found: in_file1=dashboard-daily-2011-12-26
Parameter found: out_file1=dashboard-daily-2011-12-26
Parameter found: current_date_num=20111226
Parameter found: timeperiod=1

=== Now with 0.9.1 ===
$ /usr/lib/pig-0.9.1/bin/pig -d INFO \
    -p in_file='/chukwa/repos/Insight-Demo/' \
    -p process_glob='20111226/*/*/*.evt' \
    -p out_file='dashboard-daily-2011-12-26' \
    -p in_file1='dashboard-daily-2011-12-26' \
    -p out_file1='dashboard-daily-2011-12-26' \
    -p current_date_num='20111226' \
    -p timeperiod='1' ap.pig

(System.out.println() calls added for effect)
0.9.1 java.class.path /usr/lib/hadoop-0.20/conf:/usr/java/default/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/cloudera-desktop-plugins-0.3.0.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-capacity-scheduler-0.20.2-cdh3u0-SNAPSHOT.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop-0.20/lib/hadoop-lzo-0.4.9.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/etc/hbase:/usr/lib/pig-0.9.1/bin/../conf:/usr/java/default/lib/tools.jar:/etc/hadoop/conf:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.9.jar:/usr/lib/pig-0.9.1/bin/../lib/automaton.jar:/usr/lib/pig-0.9.1/bin/../lib/jython-2.5.0.jar:/usr/lib/pig-0.9.1/bin/../pig-withouthadoop.jar::/usr/local/hbase/hbase-0.90.4.jar:/usr/local/hbase/lib/zookeeper-3.3.2.jar:/usr/local/hbase/conf:/usr/local/hbase/hbase-0.90.4.jar:/usr/local/hbase/lib/zookeeper-3.3.2.jar:/usr/local/hbase/conf

Parameter found: in_file=/chukwa/repos/Insight-Demo/
Parameter found: null
Parameter found: out_file=dashboard-daily-2011-12-26
Parameter found: in_file1=dashboard-daily-2011-12-26
Parameter found: out_file1=dashboard-daily-2011-12-26
Parameter found: current_date_num=20111226
Parameter found: timeperiod=1

With 0.9.1 the second parameter, "process_glob", isn't parsed correctly (it
comes through as null), so the asterisks now need to be escaped like this:

/usr/lib/pig-0.9.1/bin/pig -d INFO \
    -p in_file='/chukwa/repos/Insight-Demo/' \
    -p process_glob='20111226/\*/\*/\*.evt' \
    -p out_file='dashboard-daily-2011-12-26' \
    -p in_file1='dashboard-daily-2011-12-26' \
    -p out_file1='dashboard-daily-2011-12-26' \
    -p current_date_num='20111226' \
    -p timeperiod='1' ap.pig
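
If the glob comes from a wrapper script rather than being typed by hand, the escaping can be done in one place. This is a minimal sketch of that workaround only; the helper name escape_glob is made up, and it assumes that backslash-escaping every asterisk is all that 0.9.1 needs, as in the command above.

```shell
# Backslash-escape every asterisk in a glob pattern so that it survives
# parameter parsing, e.g. 20111226/*/*/*.evt -> 20111226/\*/\*/\*.evt
escape_glob() {
  printf '%s\n' "$1" | sed 's/\*/\\*/g'
}

# Example use in a wrapper script:
#   process_glob=$(escape_glob '20111226/*/*/*.evt')
#   /usr/lib/pig-0.9.1/bin/pig -p process_glob="$process_glob" ... ap.pig
```

Only asterisks are touched; other glob characters such as ? or [ ] are left as they are, since only the asterisk was seen to cause trouble here.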
On Tue, Dec 27, 2011 at 6:15 PM, Aniket Mokashi <[EMAIL PROTECTED]> wrote:
