Pig, mail # user - Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf


Michał Czerwiński 2012-11-12, 16:47
Cheolsoo Park 2012-11-12, 17:37
Michał Czerwiński 2012-11-12, 17:59
Cheolsoo Park 2012-11-12, 18:09
Michał Czerwiński 2012-11-12, 18:29
Cheolsoo Park 2012-11-12, 18:45
Michał Czerwiński 2012-11-13, 15:16
Michał Czerwiński 2012-11-13, 15:40
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-13, 17:18
Hi Michal,

Thanks for sharing your workaround.

I think that Pig should be able to handle empty file names in
-Dpig.additional.jars, so users don't have to spend hours debugging problems
like this. So I filed a JIRA:
https://issues.apache.org/jira/browse/PIG-3046

We will get this fixed in a future release.

Thanks,
Cheolsoo
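
Until a release with that fix is available, one possible stopgap is to clean the list before handing it to Pig. The following is a minimal sketch, not anything Pig ships: `sanitize_jar_list` is a hypothetical helper name, and it assumes that only existing regular files should be passed along, which may be stricter than Pig itself requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Pig): print a colon-separated list that
# keeps only entries naming existing regular files. Empty entries,
# directories, and missing paths are dropped.
sanitize_jar_list() {
    local rest entry result
    rest="$1:"      # trailing ':' so the final entry is handled like the rest
    result=""
    while [ -n "$rest" ]; do
        entry="${rest%%:*}"   # text before the first ':'
        rest="${rest#*:}"     # everything after the first ':'
        [ -n "$entry" ] || continue   # skip empty entries, e.g. "a::b"
        [ -f "$entry" ] || continue   # skip directories and missing files
        result="${result:+$result:}$entry"
    done
    printf '%s\n' "$result"
}
```

The cleaned value could then be passed as `-Dpig.additional.jars="$(sanitize_jar_list "$JARS")"`, where `JARS` is an assumed variable holding the original list.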

On Tue, Nov 13, 2012 at 7:40 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:

> Oh well, I changed
> PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar:$HIVE_HOME/conf:$HADOOP_HOME/conf"
> into
> PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar"
>
> while still loading the Hive libraries via
> for file in $HIVE_HOME/lib/*.jar; do
>     #echo "==> Adding $file"
>     PIG_CLASSPATH="$PIG_CLASSPATH:$file"
> done
>
> and that seems to be working fine now. Thanks a lot for the help debugging it!
>
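
The workaround above can also be written so that the list never picks up an empty entry, for example when the starting value is empty or the glob matches nothing. This is a rough sketch under those assumptions; `append_jars` is a hypothetical helper name, and the paths simply reuse the variables from the message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append every *.jar file in a directory to a
# colon-separated list without creating leading or doubled colons.
append_jars() {
    local list dir file
    list="$1"
    dir="$2"
    for file in "$dir"/*.jar; do
        # When nothing matches, the shell leaves the pattern unexpanded;
        # the -f test filters that out along with any non-file match.
        [ -f "$file" ] || continue
        list="${list:+$list:}$file"   # add ':' only if list is non-empty
    done
    printf '%s\n' "$list"
}

# Same starting point as the workaround: the HCatalog jar only, no conf dirs.
PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar"
PIG_CLASSPATH="$(append_jars "$PIG_CLASSPATH" "$HIVE_HOME/lib")"
export PIG_CLASSPATH
```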
> On 13 November 2012 15:16, Michał Czerwiński <[EMAIL PROTECTED]> wrote:
>
> > Right, it looks like this:
> >
> > 2012-11-13 15:13:57,100 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar
> > 2012-11-13 15:13:57,428 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/usr/lib/hive/conf/
> > 2012-11-13 15:13:57,433 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
> > Details at logfile: /opt/pig/trunk/pig_1352819617642.log
> >
> > #> ls -la /usr/lib/hive/conf/
> > total 88
> > drwxr-xr-x 2 root root  4096 2012-11-12 17:48 .
> > drwxr-xr-x 8 root root  4096 2012-11-09 17:29 ..
> > -rw-r--r-- 1 root root 39451 2012-11-08 10:24 hive-default.xml
> > -rw-r--r-- 1 root root  1408 2012-11-08 11:22 hive-env.sh
> > -rw-r--r-- 1 root root  1410 2012-11-08 10:24 hive-env.sh.template
> > -rw-r--r-- 1 root root  1637 2012-11-08 10:24 hive-exec-log4j.properties
> > -rw-r--r-- 1 root root  2005 2012-11-08 10:24 hive-log4j.properties
> > -rw-r--r-- 1 root root  4055 2012-11-08 11:22 hive-site-client.xml.tpl
> > -rw-rw-r-- 1 root root  4879 2012-11-09 15:30 hive-site.xml
> > -rw-r--r-- 1 root root  4903 2012-11-09 15:30 hive-site.xml.PIG.tpl
> > -rw-r--r-- 1 root root  3634 2012-11-08 11:22 hive-site.xml.tpl
> >
> > On 12 November 2012 18:45, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
> >
> >> Can you try printing debug messages by adding "-d DEBUG" to the Pig
> >> command? It will show which additional files are added to the
> >> distributed cache, as follows:
> >>
> >> 2012-11-12 10:41:58,908 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
> >> 2012-11-12 10:41:59,099 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/
> >>
> >> This will tell you which file it was shipping right before it failed.
> >> That will probably give you a hint on where to look further.
> >>
> >> Thanks,
> >> Cheolsoo
> >>
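
The check described above can be scripted once the debug output has been captured to a file. A minimal sketch, assuming each log record sits on a single line in the real output (the wrapping seen in mail is an artifact); `last_shipped_entry` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last entry that Pig reported adding to the
# distributed cache, i.e. the one being shipped right before a failure.
# $1 is the path to a file holding the captured "-d DEBUG" output.
last_shipped_entry() {
    grep 'Adding jar to DistributedCache:' "$1" \
        | sed 's/.*Adding jar to DistributedCache: *//' \
        | tail -n 1
}
```

One way to capture the output is to redirect both streams of the Pig run into a file, for example with `2>&1 | tee pig-debug.log` (the file name is an assumption), and then call `last_shipped_entry pig-debug.log`.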
> >>
> >> On Mon, Nov 12, 2012 at 10:29 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:
> >>
> >> > Seems like exactly the same error.
> >> >
> >> > I do it like this:
> >> >
> >> > > export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
> >> > which resolves to /usr/lib/jvm/java-6-sun-1.6.0.21/jre/
> >> >
> >> > > bin/pig -Dpig.additional.jars=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf:/usr/lib/hive/lib/ant-contrib-1.0b3.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/avro-1.5.4.jar:/usr/lib/hive/lib/avro-ipc-1.5.4.jar:/usr/lib/hive/lib/avro-mapred-1.5.4.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar:/usr/lib/hive/lib/datanucleus-core-2.0.3-ZD5977-CDH5293.jar:/usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar:/usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/guava-r06.jar:/usr/lib/hive/lib/haivvreo-1.0.7-cdh-2.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/hive-anttasks-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-cli-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-common-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-exec-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-jdbc-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-metastore-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-serde-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-service-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-shims-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/jackson-core-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-jaxrs-1.7.3.jar:/usr/lib/hive/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-xc-1.7.3.jar:/usr/lib/hive/lib/jdo2-api-2.3-ec.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/libfb303.jar:/usr/lib/hive/lib/libthrift.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/slf4j-api-1.6.1.jar:/usr/lib/hive/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hive/lib/snappy-java-1.0.3.2.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/thrift-0.5.0.jar:/usr/lib/hive/lib/thrift-fb303-0.5.0.jar:/usr/lib/hive/lib/velocity-1.5.jar
Michał Czerwiński 2012-11-13, 17:34