Pig, mail # user - Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf


Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Cheolsoo Park 2012-11-12, 18:45
Can you try printing debug messages by adding "-d DEBUG" to the Pig
command? It will print which additional files are added to the distributed
cache, as follows:

2012-11-12 10:41:58,908 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
2012-11-12 10:41:59,099 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/

This will tell you which file Pig was shipping right before it failed. That
will probably give you a hint about where to look further.
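
If the debug output is long, something like the following could pull out the last entry. This is only a rough sketch: it assumes each log record sits on a single line (the sample above may be wrapped by the mail client) and that the message text is exactly "Adding jar to DistributedCache:" as shown. The function name last_cached_file and the file names pig-debug.log and script.pig are made up for illustration.

```shell
#!/bin/sh
# Print the last path that a Pig debug log reports as added to the
# DistributedCache. Assumes one log record per line.
last_cached_file() {
    # $1: log file captured from a "pig -d DEBUG" run (hypothetical name)
    grep 'Adding jar to DistributedCache:' "$1" \
        | sed 's/.*Adding jar to DistributedCache: *//' \
        | tail -n 1
}

# Example usage (script.pig and pig-debug.log are placeholders):
#   bin/pig -d DEBUG script.pig 2>&1 | tee pig-debug.log
#   last_cached_file pig-debug.log
```

The last path printed is the one to inspect first, since it was the most recent file Pig logged before the job stopped.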

Thanks,
Cheolsoo
On Mon, Nov 12, 2012 at 10:29 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:

> Seems like exactly the same error.
>
> This is how I run it:
>
> > export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
> which resolves to /usr/lib/jvm/java-6-sun-1.6.0.21/jre/
>
> > bin/pig
>
> -Dpig.additional.jars=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf:/usr/lib/hive/lib/ant-contrib-1.0b3.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/avro-1.5.4.jar:/usr/lib/hive/lib/avro-ipc-1.5.4.jar:/usr/lib/hive/lib/avro-mapred-1.5.4.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar:/usr/lib/hive/lib/datanucleus-core-2.0.3-ZD5977-CDH5293.jar:/usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar:/usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/guava-r06.jar:/usr/lib/hive/lib/haivvreo-1.0.7-cdh-2.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/hive-anttasks-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-cli-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-common-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-exec-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-jdbc-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-metastore-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-serde-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-service-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-shims-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/jackson-core-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-jaxrs-1.7.3.jar:/usr/lib/hive/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-xc-1.7.3.jar:/usr/lib/hive/lib/jdo2-api-2.3-ec.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/libfb303.jar:/usr/lib/hive/lib/libthrift.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/slf4j-api-1.6.1.jar:/usr/lib/hive/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hive/lib/snappy-java-1.0.3.2.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/thrift-0.5.0.jar:/usr/lib/hive/lib/thrift-fb303-0.5.0.jar:/usr/lib/hive/lib/velocity-1.5.jar
>
> which is basically
> echo  bin/pig -Dpig.additional.jars="$PIG_CLASSPATH" "$@"
>
> grunt> A = load 'xxx.yyy' using org.apache.hcatalog.pig.HCatLoader;
> 2012-11-12 18:22:38,398 [main] INFO  hive.metastore - Trying to connect to
> metastore with URI thrift://hcatalog:10002
> 2012-11-12 18:22:38,506 [main] INFO  hive.metastore - Connected to
> metastore.
> grunt> B = FILTER A BY keyword=='FU';
> grunt> ll = LIMIT B 10;
> grunt> dump ll;
>
> 2012-11-12 18:22:47,567 [main] INFO
>  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
> script: FILTER,LIMIT
> 2012-11-12 18:22:47,901 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
> File concatenation threshold: 100 optimistic? false
> 2012-11-12 18:22:48,026 [main] INFO