Pig, mail # dev - Re: PigServer API


Re: PigServer API
Prashant Kommireddi 2012-10-11, 19:54
True, that would serve the purpose. However, I feel the
abstraction could be at a lower level so callers of other functions such as
"store" could use it too.
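
To make the idea from the quoted thread below concrete, here is a minimal sketch of the kind of helper being discussed: copy an iterator of job stats into a List, then count or filter jobs across several runs. It uses plain Java only. JobStatsSketch and JobListSketch are hypothetical stand-ins invented for illustration, not Pig's JobStats or PigStats classes, and the method names are assumptions rather than the API in the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a per-job stats object; NOT Pig's JobStats.
final class JobStatsSketch {
    final String jobId;
    final boolean successful;

    JobStatsSketch(String jobId, boolean successful) {
        this.jobId = jobId;
        this.successful = successful;
    }
}

// Hypothetical helper class; names are illustrative only.
final class JobListSketch {
    private JobListSketch() {}

    // Copies everything the iterator yields into a List, keeping iteration order.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Total number of jobs across several runs (one inner list per script or store).
    static int totalJobs(List<List<JobStatsSketch>> runs) {
        int total = 0;
        for (List<JobStatsSketch> run : runs) {
            total += run.size();
        }
        return total;
    }

    // Returns only the jobs marked as failed, in their original order.
    static List<JobStatsSketch> failedJobs(List<JobStatsSketch> jobs) {
        List<JobStatsSketch> failed = new ArrayList<>();
        for (JobStatsSketch job : jobs) {
            if (!job.successful) {
                failed.add(job);
            }
        }
        return failed;
    }
}
```

A framework layer could call toList() once per script run and pass the collected lists to totalJobs(). Whether this belongs in the API or in caller code is exactly the trade-off debated in the thread.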

On Thu, Oct 11, 2012 at 12:27 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:

> Doesn't executeBatch() return exactly what you want?
>
>
>
> On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi
> <[EMAIL PROTECTED]> wrote:
> > I knew I had those negotiation skills :)
> >
> > Patch is available, please review. It's a minor one
> > https://issues.apache.org/jira/browse/PIG-2964
> >
> > -Prashant
> >
> > On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >
> >> Ok, I'm sold. :)
> >>
> >>
> >> On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >>
> >>> Thanks Bill.
> >>>
> >>> The rationale behind providing a List is that it simply provides a lot
> >>> more methods than an iterator. You are right in saying one could do that in
> >>> the caller code, but I have a feeling providing this helper in the API would be
> >>> beneficial. For example, a framework that is used by clients could initiate
> >>> several Pig scripts/store commands at once. At the framework layer, you
> >>> might want to be able to determine the total number of MR jobs spawned
> >>> by these multiple scripts and query stats on those. That's just one
> >>> use case; there could be more methods on List that a user could be
> >>> interested in.
> >>>
> >>> -Prashant
> >>>
> >>>
> >>> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Hi Prashant,
> >>>>
> >>>> [Replying to the dev list to get others' take on these...]
> >>>>
> >>>> Just curious, why do you prefer a List of JobStats over the already
> >>>> existing iterator? I hesitate to add one-liner methods if it's something
> >>>> that can be a one-liner by the caller, unless the use case is very common.
> >>>>
> >>>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable
> >>>> to me.
> >>>>
> >>>> I'm not sure about the rationale behind the differences between
> >>>> registerScript() and store(). store() and registerQuery() are able to
> >>>> manually add to the DAG as statements come in, but registerScript() needs
> >>>> parsing for execution. That's probably why execution is delegated to the
> >>>> GruntParser. The resulting DAG for a single-store script should be the same
> >>>> though. It seems like registerScript() should be able to return a list of
> >>>> ExecJobs.
> >>>>
> >>>> thanks,
> >>>> Bill
> >>>>
> >>>>
> >>>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi <
> >>>> [EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Hi Bill,
> >>>>>
> >>>>> I am looking at PigStats and JobGraph, and am thinking of adding some
> >>>>> functions. Let me know what you think.
> >>>>>
> >>>>> *getJobList()* returns a List representation of the iterator.
> >>>>>
> >>>>> public List<JobStats> getJobList() {
> >>>>>             return IteratorUtils.toList(iterator());
> >>>>> }
> >>>>>
> >>>>> What do you think about making getSuccessfulJobs() and getFailedJobs()
> >>>>> public and exposing them in the API? Currently they are package-private.
> >>>>>
> >>>>> I had another question: it seems like the execution flow for
> >>>>> PigServer.registerScript/Query is different from PigServer.store(). Was
> >>>>> there a reason to make these different? The function store() returns an
> >>>>> ExecJob, which is great for getting info regarding the runs, but
> >>>>> registerScript() calls the GruntParser for execution, which I think is a
> >>>>> different flow?
> >>>>>
> >>>>> Thanks,
> >>>>> Prashant
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >>>>>
> >>>>>> Makes sense to me. We could return a PigStats object.
> >>>>>>
> >>>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi <
> >>>>>> [EMAIL PROTECTED]>wrote:
> >>>>>>
> >>>>>> > Hi All,
> >>>>>> >
> >>>>>> > I am looking at PigServer methods for running scripts/queries and