Pig >> mail # user >> Pig APIs


Prashant Kommireddi 2013-01-22, 01:26
Jonathan Coveney 2013-01-22, 01:30
Prashant Kommireddi 2013-01-22, 01:47
Prashant Kommireddi 2013-01-22, 22:43
Jonathan Coveney 2013-01-22, 23:05
Bill Graham 2013-01-23, 05:02
Prashant Kommireddi 2013-01-23, 06:04
Sure, here you go:

https://github.com/twitter/ambrose/blob/master/pig/src/main/java/com/twitter/ambrose/pig/AmbrosePigProgressNotificationListener.java

On Tue, Jan 22, 2013 at 10:04 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:

> Thanks Bill. Can you please point me to the Ambrose code that uses PPNL?
>
> I will open a JIRA for getting hooks into explain.
>
> On Jan 22, 2013, at 9:03 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
>
> > Yeah, getting at the info here is tricky. For Ambrose we're getting info
> > about submitted jobs, so we can just hook into the lifecycle of
> > PigProgressNotificationListener. The PPNL notifiers are pretty coupled to
> > PigStatsUtil and ScriptState, which aren't invoked during explain.
> >
> > The bulk of the action for explain all happens in the PigServer.explain(..)
> > method. That's where the logical plan, physical plan and execution plan are
> > generated before explain gets called on each to print the output. We could
> > look to add some sort of listener interface and hook here perhaps that gets
> > each of these passed during explain via a configured param.
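
To make the listener idea above concrete, here is a minimal sketch of what such a hook could look like. Everything in it is an assumption for illustration: `ExplainListener`, `FakeExplainer` and `RecordingListener` are hypothetical names, not Pig classes, and the plans are plain strings standing in for Pig's logical, physical and execution plan objects so that the sketch runs without Pig on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplainHookSketch {

    // Hypothetical callback interface; not part of Pig. Plans are Strings
    // here only so the sketch is self-contained.
    public interface ExplainListener {
        void onLogicalPlan(String plan);
        void onPhysicalPlan(String plan);
        void onExecutionPlan(String plan);
    }

    // Hypothetical stand-in for the plan-building step that the thread
    // locates inside PigServer.explain(..).
    public static class FakeExplainer {
        private final List<ExplainListener> listeners = new ArrayList<>();

        public void addListener(ExplainListener listener) {
            listeners.add(listener);
        }

        public void explain(String alias) {
            // A real implementation would compile the script into these plans;
            // placeholder strings are used instead.
            String logical = "logical:" + alias;
            String physical = "physical:" + alias;
            String execution = "execution:" + alias;
            // Hand each plan to every registered listener, nothing is executed.
            for (ExplainListener listener : listeners) {
                listener.onLogicalPlan(logical);
                listener.onPhysicalPlan(physical);
                listener.onExecutionPlan(execution);
            }
        }
    }

    // Example listener that records what it was handed, in call order.
    public static class RecordingListener implements ExplainListener {
        public final List<String> seen = new ArrayList<>();

        @Override public void onLogicalPlan(String plan) { seen.add(plan); }
        @Override public void onPhysicalPlan(String plan) { seen.add(plan); }
        @Override public void onExecutionPlan(String plan) { seen.add(plan); }
    }

    public static void main(String[] args) {
        FakeExplainer explainer = new FakeExplainer();
        RecordingListener recorder = new RecordingListener();
        explainer.addListener(recorder);
        explainer.explain("grouped");
        for (String plan : recorder.seen) {
            System.out.println(plan);
        }
    }
}
```

How the listener would be wired in (for example, through the "configured param" mentioned above) is deliberately left out, since the thread does not settle it.
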
> >
> >
> >
> > On Tue, Jan 22, 2013 at 3:05 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >
> >> I think that this is all available, it's just not the easiest thing to get
> >> at. If you look at the explain plan, it has a lot of this info, and you can
> >> definitely get at that info. I'm not sure if it has the reducers or if
> >> that's post MR setup, but you should be able to.
> >>
> >> That said, I do not think it would hurt to have hooks in to more clearly
> >> do something with this info. Bill had to do stuff like this for Ambrose, so
> >> maybe he can weigh in on what that could look like.
> >>
> >>
> >> 2013/1/22 Prashant Kommireddi <[EMAIL PROTECTED]>
> >>
> >>> Jon/others - any pointers on this? I would like to patch in hooks if this
> >>> is not possible at the moment.
> >>>
> >>> -Prashant
> >>>
> >>> On Mon, Jan 21, 2013 at 5:47 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> At the moment, basically info on I/O paths, operators used (group by,
> >>>> foreach ..), job level info such as number of reducers etc.
> >>>>
> >>>>
> >>>> On Mon, Jan 21, 2013 at 5:30 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> What level of information would you like? IE if you do "explain relation,"
> >>>>> which of the three do you want to hook into?
> >>>>>
> >>>>>
> >>>>> 2013/1/21 Prashant Kommireddi <[EMAIL PROTECTED]>
> >>>>>
> >>>>>> Been coding with the APIs and wondering if there is anything that allows
> >>>>>> you to only retrieve the operators, I/O paths etc without actually issuing
> >>>>>> an execute or a store? Basically, being able to get information
> >>>>>> post-parsing of the script but pre-execution.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Prashant
Prashant Kommireddi 2013-01-23, 07:25