Pig >> mail # dev >> Re: PigServer API


True, that would serve the purpose. However, I feel the
abstraction could be at a lower level so callers of other functions such as
"store" could use it too.
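
For illustration only, here is a minimal JDK-only sketch of the kind of List
helper discussed further down this thread. JobListSketch, JobInfo, toList and
countFailed are made-up names standing in for the real Pig classes (JobStats,
JobGraph); this is not the actual Pig API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class JobListSketch {

    // Hypothetical stand-in for Pig's JobStats; only what this sketch needs.
    static final class JobInfo {
        final String jobId;
        final boolean successful;

        JobInfo(String jobId, boolean successful) {
            this.jobId = jobId;
            this.successful = successful;
        }
    }

    // Copies the remaining elements of an iterator into a new list,
    // similar in spirit to the IteratorUtils.toList() call quoted below.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Example of something a caller can do easily once it holds a List.
    static int countFailed(List<JobInfo> jobs) {
        int failed = 0;
        for (JobInfo job : jobs) {
            if (!job.successful) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Iterator<JobInfo> it = Arrays.asList(
                new JobInfo("job_1", true),
                new JobInfo("job_2", false)).iterator();
        List<JobInfo> jobs = toList(it);
        System.out.println(jobs.size() + " jobs, " + countFailed(jobs) + " failed");
    }
}
```

The patch on PIG-2964 may well look different; the sketch only shows why a
List (size(), indexed access, repeated traversal) is handier than a bare
iterator for a caller that wants to count or inspect jobs.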

On Thu, Oct 11, 2012 at 12:27 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:

> Doesn't executeBatch() return exactly what you want?
>
>
>
> On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi
> <[EMAIL PROTECTED]> wrote:
> > I knew I had those negotiation skills :)
> >
> > Patch is available, please review. It's a minor one
> > https://issues.apache.org/jira/browse/PIG-2964
> >
> > -Prashant
> >
> > On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >
> >> Ok, I'm sold. :)
> >>
> >>
> >> On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> >>
> >>> Thanks Bill.
> >>>
> >>> The rationale behind providing a List is that it simply provides a lot
> >>> more methods than an iterator. You are right that one could do that in
> >>> the caller code, but I have a feeling providing this helper in the API
> >>> would be beneficial. For example, a framework that is used by clients
> >>> could initiate several Pig scripts/store commands at once. At the
> >>> framework layer, you might want to determine the total number of MR
> >>> jobs spawned by these multiple scripts and query stats on those. That's
> >>> just one use case; there could be more methods on List that a user
> >>> could be interested in.
> >>>
> >>> -Prashant
> >>>
> >>>
> >>> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Hi Prashant,
> >>>>
> >>>> [Replying to the dev list to get others' take on these...]
> >>>>
> >>>> Just curious, why do you prefer a List of JobStats over the already
> >>>> existing iterator? I hesitate to add one-liner methods if it's
> >>>> something that can be a one-liner by the caller, unless the use case
> >>>> is very common.
> >>>>
> >>>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable
> >>>> to me.
> >>>>
> >>>> I'm not sure about the rationale behind the differences between
> >>>> registerScript() and store(). store() and registerQuery() are able to
> >>>> manually add to the DAG as statements come in, but registerScript()
> >>>> needs parsing for execution. That's probably why execution is delegated
> >>>> to the GruntParser. The resulting DAG for a single-store script should
> >>>> be the same though. It seems like registerScript() should be able to
> >>>> return a list of ExecJobs.
> >>>>
> >>>> thanks,
> >>>> Bill
> >>>>
> >>>>
> >>>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi <
> >>>> [EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Hi Bill,
> >>>>>
> >>>>> I am looking at PigStats and JobGraph, and am thinking of adding some
> >>>>> functions. Let me know what you think.
> >>>>>
> >>>>> *getJobList()* returns a List representation of the iterator.
> >>>>>
> >>>>> public List<JobStats> getJobList() {
> >>>>>             return IteratorUtils.toList(iterator());
> >>>>> }
> >>>>>
> >>>>> What do you think about making getSuccessfulJobs() and
> >>>>> getFailedJobs() public and exposing them in the API? Currently they
> >>>>> are package-private.
> >>>>>
> >>>>> I had another question: it seems like the execution flow for
> >>>>> PigServer.registerScript/Query is different from PigServer.store().
> >>>>> Was there a reason to make these different? The function store()
> >>>>> returns an ExecJob, which is great for getting info regarding the
> >>>>> runs, but registerScript() calls the GruntParser for execution, which
> >>>>> I think is a different flow?
> >>>>>
> >>>>> Thanks,
> >>>>> Prashant
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> >>>>>
> >>>>>> Makes sense to me. We could return a PigStats object.
> >>>>>>
> >>>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi <
> >>>>>> [EMAIL PROTECTED]> wrote:
> >>>>>>
> >>>>>> > Hi All,
> >>>>>> >
> >>>>>> > I am looking at PigServer methods for running scripts/queries and