|
Michał Czerwiński
2012-11-12, 16:47
Cheolsoo Park
2012-11-12, 17:37
Michał Czerwiński
2012-11-12, 17:59
Cheolsoo Park
2012-11-12, 18:09
Michał Czerwiński
2012-11-12, 18:29
Cheolsoo Park
2012-11-12, 18:45
Michał Czerwiński
2012-11-13, 15:16
Michał Czerwiński
2012-11-13, 15:40
Cheolsoo Park
2012-11-13, 17:18
Michał Czerwiński
2012-11-13, 17:34
|
-
Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Michał Czerwiński 2012-11-12, 16:47

I am trying to use Pig 0.11 and Pig trunk (currently 0.12) because Pig 0.10 seems to have issues with Python UDFs.

According to http://www.mail-archive.com/[EMAIL PROTECTED]/msg05837.html:

"after replacing pig.jar and pig-withouthadoop.jar with the 0.11 ones from the svn trunk, they work like a charm."

That is clearly not the case for me. The error I get is:

Pig Stack Trace
---------------
ERROR 2017: Internal error creating job configuration.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias ll
    at org.apache.pig.PigServer.openIterator(PigServer.java:841)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:535)
    at org.apache.pig.Main.main(Main.java:154)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias ll
    at org.apache.pig.PigServer.storeEx(PigServer.java:940)
    at org.apache.pig.PigServer.store(PigServer.java:903)
    at org.apache.pig.PigServer.openIterator(PigServer.java:816)
    ... 12 more
Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:848)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:294)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1269)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1254)
    at org.apache.pig.PigServer.storeEx(PigServer.java:936)
    ... 14 more
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
    at org.apache.hadoop.fs.Path.<init>(Path.java:90)
    at org.apache.hadoop.fs.Path.<init>(Path.java:45)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.shipToHDFS(JobControlCompiler.java:1455)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.putJarOnClassPathThroughDistributedCache(JobControlCompiler.java:1432)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:508)
    ... 19 more
===============================================================================

I am using the following startup script:

export HADOOP_HOME=/usr/lib/hadoop-0.20
export HCAT_HOME=/opt/hcat
export HIVE_HOME=/usr/lib/hive
PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar::$HIVE_HOME/conf:$HADOOP_HOME/conf
for file in $HIVE_HOME/lib/*.jar; do
    echo "==> Adding $file"
    PIG_CLASSPATH=$PIG_CLASSPATH:$file
done
export PIG_OPTS=-Dhive.metastore.uris=thrift://appserver.hadoop.staging.qutics.com:10002
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
exec bin/pig -Dpig.additional.jars=$PIG_CLASSPATH "$@"

Any clues?

Thank you!
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-12, 17:37

Hi Michal,

Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
    at org.apache.hadoop.fs.Path.<init>(Path.java:90)
    at org.apache.hadoop.fs.Path.<init>(Path.java:45)

Your error message indicates that there is a typo somewhere in your paths. I believe that your PIG_CLASSPATH is the problem:

PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar::$HIVE_HOME/conf:$HADOOP_HOME/conf

You have a double colon (::) in the middle, and that will be interpreted as an empty string.

Thanks,
Cheolsoo
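The double-colon diagnosis above can be checked mechanically before Pig is launched. The following is only a minimal sketch: `has_empty_entry` is a hypothetical helper name, not part of Pig or Hadoop, and it assumes the list is colon-separated as in the startup script.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): succeeds when a colon-separated
# list contains an empty entry, i.e. the list is empty, starts or ends
# with ':', or contains '::' somewhere in the middle.
has_empty_entry() {
  case "$1" in
    ''|:*|*:|*::*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample value shaped like the PIG_CLASSPATH from the startup script.
sample='/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar::/usr/lib/hive/conf'
if has_empty_entry "$sample"; then
  echo "empty entry found in: $sample" >&2
fi
```

A check like this could be run on "$PIG_CLASSPATH" just before the `exec bin/pig` line, so that the script stops with a readable message instead of a Java stack trace.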
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Michał Czerwiński 2012-11-12, 17:59

Nice spot, but it seems that is not the problem. I am able to run this successfully on Pig 0.9 and 0.10. Any ideas on how I can debug the problem further?

I also double-checked all the variables, and they seem to be fine:

PIG_CLASSPATH=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf

/usr/lib/hive/conf and /usr/lib/hadoop-0.20/conf also exist.

Thanks.
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-12, 18:09

Interesting, because I am able to reproduce your error by passing <jar1>::<jar2> to -Dpig.additional.jars. Setting PIG_CLASSPATH itself with :: seems fine, but passing it to -Dpig.additional.jars is not.

If you still run into a failure, can you double-check whether you're seeing the same error or a different one? You shouldn't see the same error.

Cheolsoo
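Since the same string may be harmless in PIG_CLASSPATH but fail when passed to -Dpig.additional.jars, one option is to normalize the list before handing it to Pig. This is a sketch under the assumption that empty entries are never intended; `strip_empty_entries` is a hypothetical helper name, not a Pig feature.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): remove empty entries from a
# colon-separated list by collapsing runs of ':' into a single ':' and
# trimming one leading and one trailing ':'.
strip_empty_entries() {
  printf '%s\n' "$1" | sed -e 's/::*/:/g' -e 's/^://' -e 's/:$//'
}

# Example with a list shaped like the one reproduced in the thread.
strip_empty_entries '/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar::/usr/lib/hive/conf'
```

The cleaned value could then be passed as `-Dpig.additional.jars="$(strip_empty_entries "$PIG_CLASSPATH")"`. Note that this only addresses empty entries; it does not validate that each remaining entry exists.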
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Michał Czerwiński 2012-11-12, 18:29

It seems to be exactly the same error.

This is how I run it:

> export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

which resolves to /usr/lib/jvm/java-6-sun-1.6.0.21/jre/

> bin/pig -Dpig.additional.jars=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf:/usr/lib/hive/lib/ant-contrib-1.0b3.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/avro-1.5.4.jar:/usr/lib/hive/lib/avro-ipc-1.5.4.jar:/usr/lib/hive/lib/avro-mapred-1.5.4.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar:/usr/lib/hive/lib/datanucleus-core-2.0.3-ZD5977-CDH5293.jar:/usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar:/usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/guava-r06.jar:/usr/lib/hive/lib/haivvreo-1.0.7-cdh-2.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/hive-anttasks-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-cli-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-common-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-exec-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-jdbc-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-metastore-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-serde-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-service-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-shims-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/jackson-core-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-jaxrs-1.7.3.jar:/usr/lib/hive/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-xc-1.7.3.jar:/usr/lib/hive/lib/jdo2-api-2.3-ec.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/libfb303.jar:/usr/lib/hive/lib/libthrift.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/slf4j-api-1.6.1.jar:/usr/lib/hive/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hive/lib/snappy-java-1.0.3.2.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/thrift-0.5.0.jar:/usr/lib/hive/lib/thrift-fb303-0.5.0.jar:/usr/lib/hive/lib/velocity-1.5.jar

which is basically:

echo bin/pig -Dpig.additional.jars="$PIG_CLASSPATH" "$@"

grunt> A = load 'xxx.yyy' using org.apache.hcatalog.pig.HCatLoader;
2012-11-12 18:22:38,398 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hcatalog:10002
2012-11-12 18:22:38,506 [main] INFO hive.metastore - Connected to metastore.
grunt> B = FILTER A BY keyword=='FU';
grunt> ll = LIMIT B 10;
grunt> dump ll;

2012-11-12 18:22:47,567 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: FILTER,LIMIT
2012-11-12 18:22:47,901 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2012-11-12 18:22:48,026 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 2
2012-11-12 18:22:48,026 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 2
2012-11-12 18:22:48,249 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2012-11-12 18:22:48,333 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2012-11-12 18:22:48,629 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hcatalog:10002
2012-11-12 18:22:48,630 [main] INFO hive.metastore - Connected to metastore.
2012-11-12 18:22:49,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2012-11-12 18:22:49,705 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
Details at logfile: /opt/pig/trunk/pig_1352744549197.log

The content of the above-mentioned file is:

Pig Stack Trace
ERROR 2017: Internal error creating job configuration.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias ll
    at org.apache.pig.PigServer.openIterator(PigServer.java:841)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:535)
    at org.apache.pig.Main.main(Main.java:154)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias ll
    at org.apache.pig.PigServer.storeEx(PigServer.java:940)
    at org.apache.pig.PigServer.store(PigServer.java:903)
    at org.apache.pig.PigServer.openIterator(PigServer.java:816)
    ... 12 more
Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:848)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(Jo
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-12, 18:45

Can you print debug messages by adding "-d DEBUG" to the Pig command? It will print which additional files are added to the distributed cache, as follows:

2012-11-12 10:41:58,908 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
2012-11-12 10:41:59,099 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/

This will tell you which file Pig was shipping right before it failed. That will probably give you a hint on where to look further.

Thanks,
Cheolsoo
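When the debug output is long, the last entry shipped before the failure can be pulled out of the log with a small filter. This is a sketch only: `last_shipped_entry` is a hypothetical helper name, and it assumes the log lines contain the text "Adding jar to DistributedCache: " exactly as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): read Pig log text on stdin and
# print the path from the last "Adding jar to DistributedCache:" line,
# i.e. the entry Pig was shipping right before the job setup stopped.
last_shipped_entry() {
  sed -n 's/.*Adding jar to DistributedCache: //p' | tail -n 1
}

# Assumed usage: save the console output of a run with "-d DEBUG" to a
# file (pig-debug.log is a made-up name), then inspect it:
#   last_shipped_entry < pig-debug.log
```

If the printed entry is a directory or an unexpected path, that is a reasonable place to start looking.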
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Michał Czerwiński 2012-11-13, 15:16

Right, it looks like this:

2012-11-13 15:13:57,100 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar
2012-11-13 15:13:57,428 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/usr/lib/hive/conf/
2012-11-13 15:13:57,433 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
Details at logfile: /opt/pig/trunk/pig_1352819617642.log

#> ls -la /usr/lib/hive/conf/
total 88
drwxr-xr-x 2 root root  4096 2012-11-12 17:48 .
drwxr-xr-x 8 root root  4096 2012-11-09 17:29 ..
-rw-r--r-- 1 root root 39451 2012-11-08 10:24 hive-default.xml
-rw-r--r-- 1 root root  1408 2012-11-08 11:22 hive-env.sh
-rw-r--r-- 1 root root  1410 2012-11-08 10:24 hive-env.sh.template
-rw-r--r-- 1 root root  1637 2012-11-08 10:24 hive-exec-log4j.properties
-rw-r--r-- 1 root root  2005 2012-11-08 10:24 hive-log4j.properties
-rw-r--r-- 1 root root  4055 2012-11-08 11:22 hive-site-client.xml.tpl
-rw-rw-r-- 1 root root  4879 2012-11-09 15:30 hive-site.xml
-rw-r--r-- 1 root root  4903 2012-11-09 15:30 hive-site.xml.PIG.tpl
-rw-r--r-- 1 root root  3634 2012-11-08 11:22 hive-site.xml.tpl
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udfMichał Czerwiński 2012-11-13, 15:40
Oh well, I changed

    PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar:$HIVE_HOME/conf:$HADOOP_HOME/conf"

into

    PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar"

while still loading the Hive libraries via

    for file in "$HIVE_HOME"/lib/*.jar; do
        #echo "==> Adding $file"
        PIG_CLASSPATH="$PIG_CLASSPATH:$file"
    done

and that seems to be working fine now. Thanks a lot for the help debugging it!
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-13, 17:18
Hi Michal,

Thanks for sharing your workaround.

I think that Pig should be able to handle empty file names in -Dpig.additional.jars, so users don't have to spend hours debugging problems like this. So I filed a JIRA:
https://issues.apache.org/jira/browse/PIG-3046

We will get this fixed in a future release.

Thanks,
Cheolsoo
-
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Michał Czerwiński 2012-11-13, 17:34
Yeah, so just to be clear: under Pig > 0.10 the issue seems to be exactly as you describe, and it also occurs whenever -Dpig.additional.jars contains a directory path instead of a file path. This happens quite often because forums advise including HIVE_HOME and HADOOP_HOME in PIG_CLASSPATH, which is then passed to -Dpig.additional.jars.

I put a comment in the JIRA ticket.

Thanks again, Cheolsoo!
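For anyone hitting the same ERROR 2017, the directory-versus-file point can be sketched as a small shell helper. This is only an illustration under assumptions: `files_only` is a hypothetical name, not something Pig provides. It drops every entry of a colon-separated list that is not a regular file (directories such as $HIVE_HOME/conf, empty entries, missing paths), so that only jar files remain before the list is handed to -Dpig.additional.jars.

```shell
# Hypothetical helper, not part of Pig: print the colon-separated list in $1
# with everything that is not a regular file removed.
# The body runs in a subshell "( ... )" so the IFS and globbing changes
# do not leak into the calling shell.
files_only() (
    IFS=:          # split the argument on ':'
    set -f         # do not expand wildcards while splitting
    result=""
    for entry in $1; do
        # -f is false for directories, missing paths, and empty strings
        if [ -f "$entry" ]; then
            result="${result:+$result:}$entry"
        fi
    done
    printf '%s\n' "$result"
)
```

A possible use, assuming PIG_CLASSPATH has already been assembled as in the messages above, would be `PIG_CLASSPATH=$(files_only "$PIG_CLASSPATH")` before starting bin/pig. Note that this silently drops the conf directories, which is the same effect as the manual workaround described in the thread.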