Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Pig >> mail # dev >> Re: PigServer API


+
Bill Graham 2012-10-10, 17:28
+
Prashant Kommireddi 2012-10-10, 18:00
+
Bill Graham 2012-10-11, 00:54
+
Prashant Kommireddi 2012-10-11, 09:12
Copy link to this message
-
Re: PigServer API
Doesn't executeBatch() return exactly what you want?

On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi
<[EMAIL PROTECTED]> wrote:
> I knew I had those negotiation skills :)
>
> Patch is available, please review. It's a minor one
> https://issues.apache.org/jira/browse/PIG-2964
>
> -Prashant
>
> On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
>
>> Ok, I'm sold. :)
>>
>>
>> On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <[EMAIL PROTECTED]
>> > wrote:
>>
>>> Thanks Bill.
>>>
>>> The rationale behind providing a List is that it simply provides a lot
>>> more methods than an iterator. You are right in saying one could do that in
>>> the caller code, I have a feeling providing this helper in the API would be
>>> beneficial. For eg, a framework that is used by clients could initiate
>>> several pig scripts/store commands at once. At the framework layer, you
>>> might want to be able to determine the number of MR jobs in total spawned
>>> by these multiple scripts and query stats on those. That's just one
>>> use-case, there could be more methods on List that a user could be
>>> interested in.
>>>
>>> -Prashant
>>>
>>>
>>> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <[EMAIL PROTECTED]>wrote:
>>>
>>>> Hi Prashant,
>>>>
>>>> [Replying to the dev list to get others take on these...]
>>>>
>>>> Just curious, why do you prefer a List of JobStats over the already
>>>> existing iterator? I hesitate to add one-liner methods if it's something
>>>> that can be a one-liner my the caller, unless the use case if very common.
>>>>
>>>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable
>>>> to me.
>>>>
>>>> I'm not sure about the rationale behind the differences between
>>>> registerScript and store(). Store() and registerQuery() are able to
>>>> manually add to the DAG as statements come in, but register script needs
>>>> parsing for execution. That's probably why execution is delegated to the
>>>> GruntParser. The resulting DAG for a single-store script should be the same
>>>> though. It seems like registerScript() should be able to return a list of
>>>> ExecJobs.
>>>>
>>>> thanks,
>>>> Bill
>>>>
>>>>
>>>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi Bill,
>>>>>
>>>>> I am looking at PigStats and JobGraph, and am thinking of adding some
>>>>> functions. Let me know what you think.
>>>>>
>>>>> *getJobList()* returns a List representation of the iterator.
>>>>>
>>>>> public List<JobStats> getJobList() {
>>>>>             return IteratorUtils.toList(iterator());
>>>>> }
>>>>>
>>>>> What do you think about making getSuccessfulJobs() and getFailedJobs()
>>>>> public and exposing it to the API? Currently they are package-private?
>>>>>
>>>>> Had another question, seems like the execution flow for
>>>>> PigServer.registerScript/Query is different from PigServer.store(). Was
>>>>> there a reason to make these different? The function store() returns an
>>>>> ExecJob which is great to get info regarding the runs, but registerScript()
>>>>> calls the GruntParser for execution which I think is a different flow?
>>>>>
>>>>> Thanks,
>>>>> Prashant
>>>>>
>>>>>
>>>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[EMAIL PROTECTED]>wrote:
>>>>>
>>>>>> Makes sense to me. We could return a PigStats object.
>>>>>>
>>>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi <
>>>>>> [EMAIL PROTECTED]>wrote:
>>>>>>
>>>>>> > Hi All,
>>>>>> >
>>>>>> > I am looking at PigServer methods for running scripts/queries and it
>>>>>> seems
>>>>>> > like currently theie return type is void which does not tell much
>>>>>> about job
>>>>>> > completion.
>>>>>> >
>>>>>> >     public void registerScript(InputStream in, Map<String,String>
>>>>>> > params,List<String> paramsFiles) throws IOException {
>>>>>> >         try {
>>>>>> >             String substituted = doParamSubstitution(in, params,
>>>>>> > paramsFiles);
>>>>>> >             GruntParser grunt = new GruntParser(new
+
Prashant Kommireddi 2012-10-11, 19:54
+
Dmitriy Ryaboy 2012-10-11, 21:28
+
Dmitriy Ryaboy 2012-10-11, 21:28