Pig >> mail # user >> Tracking parts of a job taking the most time


John Meek 2013-06-05, 02:11
Johnny Zhang 2013-06-05, 06:15
Ruslan Al-Fakikh 2013-06-05, 08:03
John Meek 2013-06-05, 10:29
Re: Tracking parts of a job taking the most time
John,

DAG stands for directed acyclic graph:
http://en.wikipedia.org/wiki/Directed_acyclic_graph
Anyway, what I meant was the list of the generated MR jobs. When you launch
a Pig script from the command line, you get something like this:
INFO... job url... http://yourcluster:...jobid
every time an MR job is launched.

Then, when the script has finished, you get the full list of job IDs,
something like:
Job DAG:
job_201304081613_0032   ->      job_201304081613_0033,
job_201304081613_0033   ->      job_201304081613_0034,
job_201304081613_0034   ->    ...
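
If it helps, here is a minimal sketch of pulling that list out of saved Pig console output so you can look up each job's stats one by one. It assumes the "Job DAG:" layout shown above, and it also assumes that the last job appears on a line of its own without an arrow. The names parse_job_dag and SAMPLE_OUTPUT are made up for illustration and are not part of Pig.

```python
import re

# Matches Hadoop MR job IDs such as job_201304081613_0032.
JOB_ID = re.compile(r"job_\d+_\d+")

# Hypothetical sample, reusing the job IDs from the message above.
# Assumption: the final job is printed without an arrow.
SAMPLE_OUTPUT = """\
Job DAG:
job_201304081613_0032   ->      job_201304081613_0033,
job_201304081613_0033   ->      job_201304081613_0034,
job_201304081613_0034
"""


def parse_job_dag(text):
    """Return {job_id: [downstream job IDs]} read from the 'Job DAG:' section."""
    dag = {}
    in_section = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Job DAG:"):
            in_section = True
            continue
        if not in_section:
            continue
        if not JOB_ID.search(line):
            if dag:
                break  # first line without a job ID ends the section
            continue  # tolerate blank lines right after the header
        # Left of the arrow is the upstream job, right of it the downstream jobs.
        source, _, targets = line.partition("->")
        upstream = JOB_ID.search(source)
        if upstream is None:
            break
        dag.setdefault(upstream.group(), []).extend(JOB_ID.findall(targets))
    return dag


if __name__ == "__main__":
    for job, downstream in parse_job_dag(SAMPLE_OUTPUT).items():
        print(job, "->", ", ".join(downstream) or "(last job)")
```

Each key is one generated MR job; you can then open that job's page (the job URL printed when it was launched) to see how long it took.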

Let me know if you have further questions.
On Wed, Jun 5, 2013 at 2:29 PM, John Meek <[EMAIL PROTECTED]> wrote:

> Hi Ruslan,
> Not sure how to do this. Can you be specific? What's a DAG? Thanks.
>
> -----Original Message-----
> From: Ruslan Al-Fakikh <[EMAIL PROTECTED]>
> To: user <[EMAIL PROTECTED]>
> Sent: Wed, Jun 5, 2013 4:04 am
> Subject: Re: Tracking parts of a job taking the most time
>
>
> Hi!
>
> You can look at the Pig script stats after the script has finished. There is
> a DAG of MR jobs there. You can look at the individual MR jobs' stats to
> see how much time each MR job takes.
>
> Ruslan
>
>
> On Wed, Jun 5, 2013 at 10:15 AM, Johnny Zhang <[EMAIL PROTECTED]>
> wrote:
>
> > How about disabling multi-query execution and using the CurrentTime UDF to
> > print the time between each script block?
> >
> > Johnny
> >
> >
> > On Tue, Jun 4, 2013 at 7:11 PM, John Meek <[EMAIL PROTECTED]> wrote:
> >
> > > All,
> > >
> > > I have a 400-line Pig script which performs the calculations I need it to
> > > perform; however, I need to figure out the amount of time that specific
> > > parts of the script take.
> > >
> > > For example, the initial load from an HBase table: I'd like to know how
> > > much time the load takes before moving on to the next step.
> > >
> > > What's the easiest way to break this down?
> > >
> > >
> > > thanks,
> > > JM
> > >
> >
>
>
>
Pradeep Gollakota 2013-06-06, 11:22