Pig >> mail # user >> Set visible name of a running pig job


RE: Set visible name of a running pig job
Thanks Jonathan.  I've seen other references to using -D... on the command line, but I haven't had success with it.  I tried
  pig -param a=b -Dmapred.job.name=whatever  myscript.pig
and the script failed and I got a usage message

Apache Pig version 0.8.1 (r1094835)
compiled Apr 18 2011, 19:26:53

USAGE: Pig [options] [-] : Run interactively in grunt shell.
       Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
       Pig [options] [-f[ile]] file : Run cmds found in file.
  options include:
    -4, -log4jconf - Log4j configuration file, overrides log conf
  [...]

The usage message doesn't mention -D.

I think there's a bug in command line processing: when I reversed the order, putting -D before -param, I did not get the usage message, and my script ran.

But, mapred.job.name did not get passed through to hadoop.  The configuration reported by the jobtracker shows the default name for the job, PigLatin:myscript.pig, not the string following the -D.
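
For reference, a minimal sketch of the argument order that got past the usage message, using the same placeholder values as above. The command is only printed, so the sketch runs without Pig installed. As noted, this ordering did not make mapred.job.name reach Hadoop in this test, so it illustrates the ordering only.

```shell
#!/bin/sh
# Ordering reported to run on Pig 0.8.1: -D first, then -param, then the script.
job_name="whatever"       # placeholder value from this message
script="myscript.pig"     # script name from this message

cmd="pig -Dmapred.job.name=${job_name} -param a=b ${script}"

# Printed rather than executed, so no Pig installation is required.
echo "$cmd"
```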

So that looks like a different bug.  I have not tried the -propertyFile switch.  What would be the format of entries in that file:
  a  b
  a = b
  <xml>???</>

Thanks again,
Will
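
On the -propertyFile question, a minimal sketch, assuming Pig loads that file with Java's standard properties loader (java.util.Properties). Nothing in this thread confirms that assumption, nor that mapred.job.name set this way reaches Hadoop. The file name my.properties is made up for illustration, and the pig command is printed rather than run.

```shell
#!/bin/sh
# Assumption: -propertyFile expects a Java-style properties file,
# one key=value entry per line.
props_file="my.properties"   # hypothetical file name

cat > "$props_file" <<'EOF'
# lines starting with '#' are comments in Java properties files
a = b
mapred.job.name = whatever
EOF

# Printed rather than executed, so no Pig installation is required.
echo "pig -propertyFile $props_file myscript.pig"
```

Under the same assumption, `a b` (whitespace as the separator) would parse the same way as `a = b`, while an XML file would not be read by the plain properties loader.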

-----Original Message-----
From: Jonathan Coveney [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 26, 2011 4:47 PM
To: [EMAIL PROTECTED]
Subject: Re: Set visible name of a running pig job

Another option is to use  -Dmapred.job.name=whatever on the command line.

2011/5/26 <[EMAIL PROTECTED]>

> Thanks Eric and Mark.
>
> Now I see that job.name is documented in
> http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html#set  (duh).  It also
> says there "All Pig and Hadoop properties can be set."
>
> Trying to figure out what exactly those properties are (is there a list
> someplace?) I looked at my job configuration at
>
> http://myserver:50030/jobconf.jsp?jobid=job_2011999999_9999
>
> where I now see 'mapred.job.name' --> PigLatin:hello
>
> after I set job.name to 'hello'.
>
> But
> SET mapred.job.name hello
> doesn't seem to have any effect at all.
>
> So I am confused about which properties I can set, and how to refer to
> them.  Is there a doc or wiki page someplace that explains this to a
> pig/hadoop novice?
>
> Thanks again for your help.
>
> William F Dowling
>
> -----Original Message-----
> From: Mark Laczin [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, May 26, 2011 2:23 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Set visible name of a running pig job
>
> This will work but will (in 0.8.0 at least) change only the part of the
> job name that's not 'PigLatin:'.
>
> That is, if you use job.name 'hello' in a script named test.pig,
> you end up with a full name of:
>
> PigLatin:hello
>
> Instead of PigLatin:test.pig
>
> Just FYI.
>
> On Thu, May 26, 2011 at 2:15 PM, Eric Gaudet <[EMAIL PROTECTED]> wrote:
> > At the beginning of your script, use:
> >
> > SET job.name 'this is my alternative name';
> >
> > You can also use parameters like $PARAM in the name.
> >
> > EG
> >
> > On 05/26/2011 11:04 AM, [EMAIL PROTECTED] wrote:
> >>
> >> When I run a pig job the hadoop job tracker gui (the one on port 50030)
> >> shows ‘PigLatin:myscript.pig’ as the name of the job.  How can I configure
> >> that to show a different name than the name of the script?
> >>
> >> Thanks in advance,
> >>
> >> Will
> >>
> >> William F Dowling
> >>
> >>
> >>
> >
>
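
The suggestion that did work in this thread, SET job.name with a parameter in the name, can be sketched as follows. The parameter name JOBTAG, its value, and the job name text are hypothetical. The pig command is printed rather than run, so the sketch works without Pig installed.

```shell
#!/bin/sh
# Write a small Pig script whose first statement sets the job name.
# $JOBTAG is a hypothetical Pig parameter, filled in by -param at launch.
# The quoted heredoc delimiter keeps the shell from expanding $JOBTAG.
cat > myscript.pig <<'EOF'
SET job.name 'nightly load $JOBTAG';
-- rest of the script goes here
EOF

# Printed rather than executed. Per the thread (Pig 0.8), the job tracker
# would show this name with a 'PigLatin:' prefix in place of the script name.
echo "pig -param JOBTAG=2011-05-26 myscript.pig"
```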