Pig >> mail # user >> Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Hi Michal,

Thanks for sharing your workaround.

I think that Pig should be able to handle empty file names in
-Dpig.additional.jars, so users don't have to spend hours debugging problems
like this. So I filed a JIRA:
https://issues.apache.org/jira/browse/PIG-3046

We will get this fixed in a future release.
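
Until such a fix is available, one possible stopgap is to clean the list before handing it to Pig. The following is only a minimal sketch in POSIX shell, not anything shipped with Pig: `filter_jar_list` is a hypothetical helper name, and it assumes that every entry worth keeping is a regular file on the local filesystem. It drops empty entries as well as directories, such as the conf directories that show up in the quoted messages below.

```shell
# filter_jar_list is a hypothetical helper, not part of Pig.
# It prints the colon-separated list given as $1 with empty entries
# and anything that is not a regular file removed.
# The body runs in a subshell, so the IFS and globbing changes do not
# leak into the calling shell.
filter_jar_list() (
    IFS=':'
    set -f          # do not expand wildcards while splitting the list
    result=""
    for entry in $1; do
        # Skip empty entries (from "::" or a leading ":") and
        # directories such as .../conf.
        if [ -n "$entry" ] && [ -f "$entry" ]; then
            result="${result:+$result:}$entry"
        fi
    done
    printf '%s\n' "$result"
)
```

It could be used as `bin/pig -Dpig.additional.jars="$(filter_jar_list "$JARS")"`, where `JARS` is an assumed variable holding the original list. Note that this silently drops the conf directories, so it only helps if they are not needed in that list.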

Thanks,
Cheolsoo

On Tue, Nov 13, 2012 at 7:40 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:

> Oh well, I changed
> PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar:$HIVE_HOME/conf:$HADOOP_HOME/conf"
> into
> PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar"
>
> while still loading the Hive libraries via
> for file in $HIVE_HOME/lib/*.jar; do
>     #echo "==> Adding $file"
>     PIG_CLASSPATH="$PIG_CLASSPATH:$file"
> done
>
> and that seems to be working fine now, thanks a lot for help debugging it!
>
> On 13 November 2012 15:16, Michał Czerwiński <[EMAIL PROTECTED]> wrote:
>
> > Right, it looks like that:
> >
> > 2012-11-13 15:13:57,100 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar
> > 2012-11-13 15:13:57,428 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/usr/lib/hive/conf/
> > 2012-11-13 15:13:57,433 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
> > Details at logfile: /opt/pig/trunk/pig_1352819617642.log
> >
> > #> ls -la /usr/lib/hive/conf/
> > total 88
> > drwxr-xr-x 2 root root  4096 2012-11-12 17:48 .
> > drwxr-xr-x 8 root root  4096 2012-11-09 17:29 ..
> > -rw-r--r-- 1 root root 39451 2012-11-08 10:24 hive-default.xml
> > -rw-r--r-- 1 root root  1408 2012-11-08 11:22 hive-env.sh
> > -rw-r--r-- 1 root root  1410 2012-11-08 10:24 hive-env.sh.template
> > -rw-r--r-- 1 root root  1637 2012-11-08 10:24 hive-exec-log4j.properties
> > -rw-r--r-- 1 root root  2005 2012-11-08 10:24 hive-log4j.properties
> > -rw-r--r-- 1 root root  4055 2012-11-08 11:22 hive-site-client.xml.tpl
> > -rw-rw-r-- 1 root root  4879 2012-11-09 15:30 hive-site.xml
> > -rw-r--r-- 1 root root  4903 2012-11-09 15:30 hive-site.xml.PIG.tpl
> > -rw-r--r-- 1 root root  3634 2012-11-08 11:22 hive-site.xml.tpl
> >
> > On 12 November 2012 18:45, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
> >
> >> Can you try printing debug messages by adding "-d DEBUG" to the Pig
> >> command? It will print which additional files are added to the
> >> distributed cache, as follows:
> >>
> >> 2012-11-12 10:41:58,908 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
> >> 2012-11-12 10:41:59,099 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/
> >>
> >> This will tell you which file it was shipping right before it failed. That
> >> will probably give you a hint about where to look further.
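
With a long jar list the debug output gets large, so it may help to pull out just the last entry that was being shipped. A rough sketch, assuming the console output of the run was saved to a file and that each log record sits on a single line; `last_cached_entry` is a hypothetical helper name:

```shell
# last_cached_entry is a hypothetical helper, not part of Pig.
# $1 is a file holding the saved output of a run with "-d DEBUG".
# It prints the path from the last "Adding jar to DistributedCache"
# record, i.e. the entry Pig was handling right before it stopped.
last_cached_entry() {
    grep 'Adding jar to DistributedCache:' "$1" \
        | tail -n 1 \
        | sed 's/.*Adding jar to DistributedCache: *//'
}
```

If the printed path is a directory or an empty name rather than a jar, that entry is a likely candidate for the failure.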
> >>
> >> Thanks,
> >> Cheolsoo
> >>
> >>
> >> On Mon, Nov 12, 2012 at 10:29 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:
> >>
> >> > Seems like exactly the same error.
> >> >
> >> > This is how I run it:
> >> >
> >> > > export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
> >> > which resolves to /usr/lib/jvm/java-6-sun-1.6.0.21/jre/
> >> >
> >> > > bin/pig -Dpig.additional.jars=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf:/usr/lib/hive/lib/ant-contrib-1.0b3.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/avro-1.5.4.jar:/usr/lib/hive/lib/avro-ipc-1.5.4.jar:/usr/lib/hive/lib/avro-mapred-1.5.4.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar:/usr/lib/hive/lib/datanucleus-core-2.0.3-ZD5977-CDH5293.jar:/usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar:/usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/guava-r06.jar:/usr/lib/hive/lib/haivvreo-1.0.7-cdh-2.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/hive-anttasks-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-cli-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-common-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-exec-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-jdbc-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-metastore-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-serde-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-service-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-shims-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/jackson-core-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-jaxrs-1.7.3.jar:/usr/lib/hive/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-xc-1.7.3.jar:/usr/lib/hive/lib/jdo2-api-2.3-ec.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/libfb303.jar:/usr/lib/hive/lib/libthrift.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/slf4j-api-1.6.1.jar:/usr/lib/hive/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hive/lib/snappy-java-1.0.3.2.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/thrift-0.5.0.jar:/usr/lib/hive/lib/thrift-fb303-0.5.0.jar:/usr/lib/hive/lib/velocity-1.5.jar