Pig, mail # user - MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN


Re: MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
Ashutosh Chauhan 2010-04-07, 23:27
Hi Eric,

What you wrote is mostly true. A distilled-down script that generates the
same warnings would be helpful. The thing is, it seems that invariants
which should always hold (in theory) on the map-reduce plan generated
from a query are being violated. I would like to see a script that makes
this happen. This might be a bug.

Ashutosh

On Wed, Mar 31, 2010 at 13:36, Eric Tschetter <[EMAIL PROTECTED]> wrote:
> Ok, so if I'm understanding you correctly, the warnings are basically
> Pig saying that it cannot apply a specific optimization, is that
> correct?  If so, then might I make a suggestion that they not show up
> as "WARNING" but as "INFO" or something.  Also, a slightly more
> meaningful message (something like, "Optimization XX was not able to
> be applied because of YY") might be helpful.  It also sounds like I
> can safely ignore them as they won't affect the meaning of the pig
> script, just how it is actually executed.
>
> If this is wrong, I can try to distill down the script and get the
> same warnings.
>
> --Eric
>
>
>
> On Tue, Mar 30, 2010 at 10:37 PM, Ashutosh Chauhan
> <[EMAIL PROTECTED]> wrote:
>> After Pig compiles a query into a series of map-reduce plans, it
>> iterates through those jobs once more, looking for opportunities for
>> further optimization. While visiting such a compiled plan, there are
>> certain invariants which must always hold. If Pig finds that one of
>> them does not hold, it backs out and doesn't apply that optimization
>> step. It seems that you are running into that situation.
>> How this affects the computation of your query, I am not sure. Can you
>> paste a stripped-down version of your query which generates these
>> warning messages?
>>
>> Ashutosh
>>
>> On Fri, Mar 26, 2010 at 10:55, Eric Tschetter <[EMAIL PROTECTED]> wrote:
>>> I'm writing a fairly involved pig script to do some data munging and
>>> after all was said and done, I end up getting the warnings mentioned
>>> in the subject.
>>>
>>> The grunt output is actually:
>>>
>>> 2010-03-26 09:42:33,245 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 2 time(s).
>>> 2010-03-26 09:42:33,250 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning MULTI_LEAF_MAP 1 time(s).
>>>
>>> This is on pig 0.6.0.  I tried to find a document that can explain
>>> what these mean, but St. Google wasn't of much help.  So, I looked at
>>> the code to try and decipher what they would mean and from what I can
>>> tell
>>>
>>> "DID_NOT_FIND_LOAD_ONLY_MAP_PLAN" means that there is a map plan that
>>> meets one of the following conditions
>>>
>>> 1) has no leaf
>>> 2) has more than one leaf
>>> 3) has no root
>>> 4) has more than one root
>>> 5) the leaf is not equivalent to the root
>>>
>>> "MULTI_LEAF_MAP" apparently means that there are either zero or >=2
>>> leaves on a map plan.
>>>
>>> So, I'm guessing that the DID_NOT_FIND_LOAD_ONLY_MAP_PLAN warnings are
>>> coming from having 2 leaves.  Unfortunately, I have no clue what that
>>> means.  Can anyone shed some light on what these warnings mean, if I
>>> need to care about them and what I should look for to get rid of them
>>> (if they are, in fact, not good)?
>>>
>>> Also, if the meaning of these warnings is documented somewhere and
>>> Google just couldn't find it, please point me to the documentation.
>>>
>>> --Eric Tschetter
>>>
>>
>
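
The conditions Eric lists can be read as two small graph checks on a map
plan. What follows is a minimal Python sketch of that reading, based only
on his description of the 0.6.0 source in this thread. It is not Pig's
code: Plan, is_load_only_map_plan, has_single_leaf and summarize are
hypothetical names, and the three example plans are made up.

```python
from collections import Counter


class Plan:
    """Toy operator DAG (hypothetical, not a Pig class).

    Edges point from the producing operator to the consuming operator.
    """

    def __init__(self, nodes, edges):
        self.nodes = list(nodes)
        self.edges = list(edges)

    def roots(self):
        # A root has no incoming edge.
        targets = {dst for _, dst in self.edges}
        return [n for n in self.nodes if n not in targets]

    def leaves(self):
        # A leaf has no outgoing edge.
        sources = {src for src, _ in self.edges}
        return [n for n in self.nodes if n not in sources]


def is_load_only_map_plan(plan):
    """Exactly one root, exactly one leaf, and they are the same operator.

    Negating this gives the five conditions listed in the thread.
    """
    roots, leaves = plan.roots(), plan.leaves()
    return len(roots) == 1 and len(leaves) == 1 and roots[0] == leaves[0]


def has_single_leaf(plan):
    """False when the plan has zero leaves or two or more leaves."""
    return len(plan.leaves()) == 1


def summarize(plans):
    """Count how often each check fails, one tally per warning name."""
    counts = Counter()
    for plan in plans:
        if not is_load_only_map_plan(plan):
            counts["DID_NOT_FIND_LOAD_ONLY_MAP_PLAN"] += 1
        if not has_single_leaf(plan):
            counts["MULTI_LEAF_MAP"] += 1
    return counts


# Made-up plans: a lone load, a load feeding two branches, a two-step chain.
load_only = Plan(["load"], [])
split = Plan(["load", "filter_a", "filter_b"],
             [("load", "filter_a"), ("load", "filter_b")])
chain = Plan(["load", "filter"], [("load", "filter")])

for name, times in sorted(summarize([load_only, split, chain]).items()):
    print(f"Encountered Warning {name} {times} time(s).")
```

In this sketch a plan with two leaves fails both checks, while a
single-leaf chain whose leaf differs from its root fails only the
load-only check. Whether Pig applies both checks to the same plan is not
stated in the thread, so the two predicates are kept independent here.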