Pig >> mail # user >> PigServer not connecting to HDFS?


Zach Bailey 2010-10-27, 23:25
Jeff Zhang 2010-10-28, 01:08
Re: PigServer not connecting to HDFS?
Pig needs to know where your HDFS is, doesn't it? :)
http://pig.apache.org/docs/r0.7.0/setup.html#Embedded+Programs details
on what needs to be set for embedded programs to use Pig.
Specifically, the $HADOOPDIR part.

You could also put the conf files on the classpath, as Jeff pointed out :)
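
For what it's worth, here is a rough, stdlib-only sketch of how one might check for that from an embedded program. The class and method names (PigHdfsCheck, findHadoopConf, explicitClusterProperties) are made up for illustration, and the hdfs://localhost:9000 and localhost:9001 values are only assumed pseudo-distributed settings; use whatever your own conf files say. The "fs.default.name" and "mapred.job.tracker" keys are the Hadoop 0.20-era names, and passing them through PigServer's (ExecType, Properties) constructor is a possible fallback, not something the setup docs prescribe.

```java
import java.net.URL;
import java.util.Properties;

public class PigHdfsCheck {

    // Conf file names to look for: core-site.xml (Hadoop 0.20) and the older hadoop-site.xml.
    static final String[] CONF_FILES = {"core-site.xml", "hadoop-site.xml"};

    /** Returns the classpath location of the first Hadoop conf file found, or null if none is visible. */
    static URL findHadoopConf(ClassLoader loader) {
        for (String name : CONF_FILES) {
            URL url = loader.getResource(name);
            if (url != null) {
                return url;
            }
        }
        return null;
    }

    /** Builds explicit connection properties, as a fallback when no conf file is on the classpath. */
    static Properties explicitClusterProperties(String nameNodeUri, String jobTracker) {
        Properties props = new Properties();
        props.setProperty("fs.default.name", nameNodeUri);
        props.setProperty("mapred.job.tracker", jobTracker);
        return props;
    }

    public static void main(String[] args) {
        URL conf = findHadoopConf(PigHdfsCheck.class.getClassLoader());
        if (conf == null) {
            // Without a conf file the file system would likely default to file:///
            System.out.println("No Hadoop conf file found on the classpath");
        } else {
            System.out.println("Hadoop conf found at " + conf);
        }

        // Assumed values for a local pseudo-distributed setup; adjust to your cluster.
        Properties props = explicitClusterProperties("hdfs://localhost:9000", "localhost:9001");

        // With Pig on the classpath, these could then be handed to the server:
        //   PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
        System.out.println("fs.default.name = " + props.getProperty("fs.default.name"));
    }
}
```

If the check prints that no conf file was found, that would be consistent with the "Connecting to hadoop file system at: file:///" line in the log.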

On Thu, Oct 28, 2010 at 4:55 AM, Zach Bailey <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> Facing a weird problem and wondering if anyone has run into this before. I've been playing with PigServer to programmatically run some simple pig scripts and it does not seem to be connecting to HDFS when I pass in ExecType.MAPREDUCE.
>
> I am running in pseudo-distributed mode and have the tasktracker and namenode both running on default ports. When I run scripts by using "pig script.pig" or from the grunt console it connects to hdfs and works fine.
>
> Do I need to specify some additional properties in the PigServer constructor, or construct a custom PigContext? I had assumed that by passing ExecType.MAPREDUCE and using the defaults, everything would be fine.
>
> Would really appreciate any insight or anecdotes of others using PigServer and how they have it set up. Thanks a bunch!
>
> -Zach
>
> Here is the code I'm using:
>
> PigServer pigServer = new PigServer("mapreduce");
> pigServer.setBatchOn();
> pigServer.registerScript("/Users/zach/Desktop/test.pig");
> List<ExecJob> jobs = pigServer.executeBatch();
>
> and here is the log output:
>
> 0    [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine  - Connecting to hadoop file system at: file:///
> 622  [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns  - No column pruned for pages
> 622  [main] INFO  org.apache.pig.impl.logicalLayer.optimizer.PruneColumns  - No map keys pruned for pages
> 659  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Initializing JVM Metrics with processName=JobTracker, sessionId=
> 751  [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine  - (Name: Store(file:///output:PigStorage) - 1-70 Operator Key: 1-70)
> 789  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer  - MR plan size before optimization: 1
> 790  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer  - MR plan size after optimization: 1
> 815  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 822  [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 822  [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler  - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler  - Setting up single store job
> 2582 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2582 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - 1 map-reduce job(s) waiting for submission.
> 2590 [Thread-4] WARN  org.apache.hadoop.mapred.JobClient  - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2746 [Thread-4] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2765 [Thread-4] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics  - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 3083 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - 0% complete
> 3084 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - 100% complete
> 3084 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - 1 map reduce job(s) failed!
> 3085 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher  - There is no log file to write to.
> 3085 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher  - Backend error message during job submission
> org.apache.pig.backend.executionengine

Harsh J
www.harshj.com