-
running pig on remote cluster
Stan Rosenberg 2012-06-08, 21:08
Hi,
I am trying to submit a Pig job to a remote cluster by setting mapred.job.tracker and fs.default.name accordingly. The job does get executed on the remote cluster; however, all intermediate output is stored on the local cluster from which Pig is run. From the job configuration I can see that pig.reduce.output.dirs and pig.streaming.log.dir are referencing the local cluster. Am I supposed to set these manually, or is there an alternative?
pig -version
Apache Pig version 0.10.0 (r1328203)
compiled Apr 19 2012, 22:54:12
Thanks,
stan
-
RE: running pig on remote cluster
rakesh sharma 2012-06-10, 07:23
I would also like to hear from the experts, as I am facing the same problem.
Thanks,
Rakesh
-
Re: running pig on remote cluster
Alex Rovner 2012-06-12, 22:58
Make sure your output path has the full URI, including the namenode host and port. For example, instead of /tmp/output, use hdfs://namenode:port/tmp/output.
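The rule above can be sketched as a small helper. This is only an illustration, not part of Pig or Hadoop: the function name qualify_hdfs_path, the host namenode.example.com, and port 8020 are all assumptions, so substitute the values from your remote cluster's fs.default.name.

```python
from urllib.parse import urlparse


def qualify_hdfs_path(path, namenode, port):
    """Return `path` as a full hdfs:// URI (hypothetical helper).

    Paths that already carry a scheme (hdfs://, file://, ...) are
    returned unchanged; bare absolute paths get the namenode host
    and port prepended.
    """
    if urlparse(path).scheme:
        return path  # already fully qualified
    if not path.startswith("/"):
        raise ValueError("expected an absolute path, got %r" % path)
    return "hdfs://%s:%d%s" % (namenode, port, path)


# Assumed host and port, for illustration only.
print(qualify_hdfs_path("/tmp/output", "namenode.example.com", 8020))
# prints hdfs://namenode.example.com:8020/tmp/output
```

The resulting string is what would go into the script's STORE ... INTO clause in place of the bare /tmp/output path.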