
Dragos Munteanu 2011-02-18, 20:26
Re: Pig script only works with no_multiquery
Hi Dragos,
You might be facing this issue:
https://issues.apache.org/jira/browse/PIG-1815
It has been resolved in the Pig 0.8 branch after the official release.
We are likely to release a new 0.8 patch (pending discussion) with the
fixes. Does your Pig jar have this fix?
If not, can you please try building from
http://svn.apache.org/repos/asf/pig/branches/branch-0.8 and try again with
the new jar?
On 2/18/11 12:26 PM, "Dragos Munteanu" <[EMAIL PROTECTED]> wrote:

> Hi all,
> I have a Pig script that only runs if I turn on "-no_multiquery".
> My questions are:
> - is it expected that Pig's multiquery execution would create enough of an
> overhead that the execution should fail?

It is not expected to fail.

> - can someone explain (or point me to an explanation) of where the
> multiquery overhead comes from? I'd really like to understand it

With multi-query you end up doing more computation per task, so an issue
such as PIG-1815 might not cause failures in the non-multiquery case.
Also, PIG-1815 is caused by physical plan copies not being freed, and a
multi-query physical plan will be larger.
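
As a rough illustration of why the merged plan is larger, here is a toy
Python model. It is not Pig's planner: the three pipelines, the operator
names, and the single split step are all made up for the sketch. It only
counts how many operators one plan holds when each STORE runs on its own
versus when the shared prefix is merged and split into branches.

```python
# Toy model only: this is NOT Pig's planner or its real operator classes.
# Each hypothetical STORE is described by the chain of operators feeding it.
pipelines = {
    "out_a": ["LOAD", "FILTER", "GROUP", "STORE"],
    "out_b": ["LOAD", "FILTER", "FOREACH", "STORE"],
    "out_c": ["LOAD", "FILTER", "DISTINCT", "STORE"],
}


def shared_prefix_len(chains):
    """Number of leading operators that every chain has in common."""
    n = 0
    for ops in zip(*chains):
        if len(set(ops)) != 1:
            break
        n += 1
    return n


def plan_sizes_without_multiquery(pipelines):
    """One plan per STORE: each plan holds only its own chain."""
    return {name: len(chain) for name, chain in pipelines.items()}


def plan_size_with_multiquery(pipelines):
    """One merged plan: shared prefix once, one split step, then every branch."""
    chains = list(pipelines.values())
    k = shared_prefix_len(chains)
    branch_ops = sum(len(chain) - k for chain in chains)
    return k + 1 + branch_ops  # +1 for the assumed split step


separate = plan_sizes_without_multiquery(pipelines)
merged = plan_size_with_multiquery(pipelines)

print("largest single plan without multi-query:", max(separate.values()))
print("merged plan with multi-query:", merged)
print("total operators across separate plans:", sum(separate.values()))
```

In this sketch the merged plan does less work in total (the shared LOAD and
FILTER run once), but the one plan that each task carries is larger than any
of the separate plans, so anything that keeps extra copies of the plan around
costs more memory.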

> - is there a better way to write the pig code to do that computation? Maybe
> I can re-structure my computation, or configure my cluster differently? Or
> am I stuck with a no_multiquery execution?

If your query does not work with the latest from the 0.8 branch, please let us know.
Dragos Munteanu 2011-02-22, 23:53
Thejas M Nair 2011-02-24, 00:15