Pig >> mail # user >> Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf


Michał Czerwiński 2012-11-12, 16:47
Cheolsoo Park 2012-11-12, 17:37
Michał Czerwiński 2012-11-12, 17:59
Cheolsoo Park 2012-11-12, 18:09
Michał Czerwiński 2012-11-12, 18:29
Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf
Can you try printing debug messages by adding "-d DEBUG" to the Pig
command? It will print which additional files are added to the distributed
cache, as follows:

2012-11-12 10:41:58,908 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
2012-11-12 10:41:59,099 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/

This will tell you which file it was shipping right before it failed. That
will probably give you a hint on where to look further.
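
Reading a long debug log by eye is tedious, so here is a minimal Python sketch that pulls out every file logged after "Adding jar to DistributedCache:" and reports the last one. It assumes the log messages look exactly like the two sample lines above; the function names and the SAMPLE text are only for illustration and are not part of Pig.

```python
import re

# Message text taken from the sample DEBUG lines above. Assumption: the real
# log uses the same wording.
MARKER = "Adding jar to DistributedCache:"

# Illustrative input: the two sample lines, wrapped the way a mail client
# might wrap them.
SAMPLE = """2012-11-12 10:41:58,908 [main] DEBUG
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Adding jar to DistributedCache:
file:/home/cheolsoo/apache-ant-1.8.4/lib/ant-antlr.jar
2012-11-12 10:41:59,099 [main] DEBUG
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Adding jar to DistributedCache: file:/etc/hadoop-0.20/conf.pseudo/
"""


def shipped_files(log_text):
    """Return the paths logged after MARKER, in order of appearance."""
    # Collapse all whitespace so wrapped log lines are handled the same way
    # as unwrapped ones.
    flat = " ".join(log_text.split())
    pattern = re.escape(MARKER) + r"\s*(\S+)"
    return re.findall(pattern, flat)


def last_shipped(log_text):
    """Return the last path added to the distributed cache, or None."""
    files = shipped_files(log_text)
    return files[-1] if files else None


print(last_shipped(SAMPLE))
```

To use it on a real run, save the console output of the Pig command to a file and pass that file's contents to last_shipped() instead of SAMPLE. The last entry it reports is the file Pig was shipping right before the failure.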

Thanks,
Cheolsoo
On Mon, Nov 12, 2012 at 10:29 AM, Michał Czerwiński <[EMAIL PROTECTED]> wrote:

> Seems like exactly the same error.
>
> I do it like that:
>
> > export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
> which resolves to /usr/lib/jvm/java-6-sun-1.6.0.21/jre/
>
> > bin/pig -Dpig.additional.jars=/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar:/usr/lib/hive/conf:/usr/lib/hadoop-0.20/conf:/usr/lib/hive/lib/ant-contrib-1.0b3.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/avro-1.5.4.jar:/usr/lib/hive/lib/avro-ipc-1.5.4.jar:/usr/lib/hive/lib/avro-mapred-1.5.4.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar:/usr/lib/hive/lib/datanucleus-core-2.0.3-ZD5977-CDH5293.jar:/usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar:/usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/guava-r06.jar:/usr/lib/hive/lib/haivvreo-1.0.7-cdh-2.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/hive-anttasks-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-cli-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-common-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-exec-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-jdbc-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-metastore-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-serde-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-service-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/hive-shims-0.7.1-cdh3u5.jar:/usr/lib/hive/lib/jackson-core-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-jaxrs-1.7.3.jar:/usr/lib/hive/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/hive/lib/jackson-xc-1.7.3.jar:/usr/lib/hive/lib/jdo2-api-2.3-ec.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/libfb303.jar:/usr/lib/hive/lib/libthrift.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/slf4j-api-1.6.1.jar:/usr/lib/hive/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hive/lib/snappy-java-1.0.3.2.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/thrift-0.5.0.jar:/usr/lib/hive/lib/thrift-fb303-0.5.0.jar:/usr/lib/hive/lib/velocity-1.5.jar
>
> which is basically
> echo  bin/pig -Dpig.additional.jars="$PIG_CLASSPATH" "$@"
>
> grunt> A = load 'xxx.yyy' using org.apache.hcatalog.pig.HCatLoader;
> 2012-11-12 18:22:38,398 [main] INFO  hive.metastore - Trying to connect to
> metastore with URI thrift://hcatalog:10002
> 2012-11-12 18:22:38,506 [main] INFO  hive.metastore - Connected to
> metastore.
> grunt> B = FILTER A BY keyword=='FU';
> grunt> ll = LIMIT B 10;
> grunt> dump ll;
>
> 2012-11-12 18:22:47,567 [main] INFO
>  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
> script: FILTER,LIMIT
> 2012-11-12 18:22:47,901 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
> File concatenation threshold: 100 optimistic? false
> 2012-11-12 18:22:48,026 [main] INFO
Michał Czerwiński 2012-11-13, 15:16
Michał Czerwiński 2012-11-13, 15:40
Cheolsoo Park 2012-11-13, 17:18
Michał Czerwiński 2012-11-13, 17:34