Pig user mailing list: Pig APIs


Thanks Bill. Can you please point me to the Ambrose code that uses PPNL?

I will open a JIRA for adding hooks into explain.
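
For reference, the kind of hook discussed in the quoted thread below might be sketched roughly as follows. This is only an illustration under stated assumptions: `ExplainListener`, `RecordingListener` and the stand-in `explain` method are hypothetical names, not part of Pig's API, and the plans are represented as plain strings rather than Pig's plan objects. Listeners are passed in directly here; a real patch would presumably load them from a configured param.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplainHookSketch {

    /** Hypothetical callback, NOT part of Pig: receives each plan as it is generated. */
    interface ExplainListener {
        void planGenerated(String stage, String planText);
    }

    /** Example listener that records the order in which plan stages arrive. */
    static class RecordingListener implements ExplainListener {
        final List<String> stages = new ArrayList<>();

        @Override
        public void planGenerated(String stage, String planText) {
            stages.add(stage);
        }
    }

    /**
     * Stand-in for an explain call (assumption: not Pig's real method).
     * It builds placeholder logical, physical and execution plans for the
     * alias and hands each one to every registered listener, without
     * executing or storing anything.
     */
    static void explain(String alias, List<ExplainListener> listeners) {
        String[][] plans = {
            {"logical", "logical plan for " + alias},
            {"physical", "physical plan for " + alias},
            {"execution", "execution plan for " + alias},
        };
        for (String[] plan : plans) {
            for (ExplainListener listener : listeners) {
                listener.planGenerated(plan[0], plan[1]);
            }
        }
    }

    public static void main(String[] args) {
        RecordingListener recorder = new RecordingListener();
        List<ExplainListener> listeners = new ArrayList<>();
        listeners.add(recorder);
        // A second listener that simply prints what it is given.
        listeners.add((stage, text) -> System.out.println(stage + ": " + text));

        explain("grouped", listeners);
        System.out.println("stages seen: " + recorder.stages);
    }
}
```

The point of the design is that a caller gets the plans post-parsing but pre-execution, which is the information the original question asks for.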

On Jan 22, 2013, at 9:03 PM, Bill Graham <[EMAIL PROTECTED]> wrote:

> Yeah, getting at the info here is tricky. For Ambrose we're getting info
> about submitted jobs, so we can just hook into the lifecycle of
> PigProgressNotificationListener. The PPNL notifiers are pretty coupled to
> PigStatsUtil and ScriptState, which aren't invoked during explain.
>
> The bulk of the action for explain all happens in the PigServer.explain(..)
> method. That's where the logical plan, physical plan and execution plan are
> generated before explain gets called on each to print the output. We could
> perhaps add a listener interface and a hook here, enabled via a configured
> param, that gets each of these plans passed to it during explain.
>
>
>
> On Tue, Jan 22, 2013 at 3:05 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
>> I think that this is all available, it's just not the easiest thing to get
>> at. If you look at the explain plan, it has a lot of this info, and you can
>> definitely get at that info. I'm not sure if it has the reducers or if
>> that's post MR setup, but you should be able to.
>>
>> That said, I do not think it would hurt to have hooks that make it easier
>> to do something with this info. Bill had to do stuff like this for Ambrose,
>> so maybe he can weigh in on what that could look like.
>>
>>
>> 2013/1/22 Prashant Kommireddi <[EMAIL PROTECTED]>
>>
>>> Jon/others - any pointers on this? I would like to patch in hooks if this
>>> is not possible at the moment.
>>>
>>> -Prashant
>>>
>>> On Mon, Jan 21, 2013 at 5:47 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
>>>
>>>> At the moment, basically info on I/O paths, operators used (group by,
>>>> foreach ..), job level info such as number of reducers etc.
>>>>
>>>>
>>>> On Mon, Jan 21, 2013 at 5:30 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> What level of information would you like? I.e., if you do "explain
>>>>> relation," which of the three do you want to hook into?
>>>>>
>>>>>
>>>>> 2013/1/21 Prashant Kommireddi <[EMAIL PROTECTED]>
>>>>>
>>>>>> Been coding with the APIs and wondering if there is anything that allows
>>>>>> you to only retrieve the operators, I/O paths etc without actually issuing
>>>>>> an execute or a store? Basically, being able to get information
>>>>>> post-parsing of the script but pre-execution.
>>>>>>
>>>>>> Thanks,
>>>>>> Prashant
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [EMAIL PROTECTED] going forward.*