Pig >> mail # user >> MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN


Re: MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
Also, which Pig version are you seeing this on? Would it be possible
for you to try your script on trunk and see if you still get the
warnings?

Ashutosh

On Wed, Apr 7, 2010 at 18:41, Eric Tschetter <[EMAIL PROTECTED]> wrote:
> Ok, I'll try to get one.  I'm not sure what exact part of the script
> is doing it so it might take some time.  In my wanderings around the
> interwebs trying to figure out what the warnings meant, however, there
> was someone else who posted to this (I think) list with a script that
> produced the warnings:
>
> http://www.search-hadoop.com/m?[EMAIL PROTECTED]||
>
> It looks like I'm doing something similar to him.  I'll try to see if
> I can figure out what part of my script is causing it to happen and
> then just make up something that also causes it though.
>
> --Eric
>
>
> On Wed, Apr 7, 2010 at 4:27 PM, Ashutosh Chauhan
> <[EMAIL PROTECTED]> wrote:
>> Hi Eric,
>>
>> What you wrote is mostly true. A distilled-down script that generates the
>> same warnings would be helpful. The thing is, it seems that invariants
>> which should always hold (in theory) on the map-reduce plan generated
>> from the query are being violated. I would like to see a script that can
>> make this happen. This might be a bug.
>>
>> Ashutosh
>>
>> On Wed, Mar 31, 2010 at 13:36, Eric Tschetter <[EMAIL PROTECTED]> wrote:
>>> Ok, so if I'm understanding you correctly, the warnings are basically
>>> Pig saying that it cannot apply a specific optimization, is that
>>> correct?  If so, then might I make a suggestion that they not show up
>>> as "WARNING" but as "INFO" or something.  Also, a slightly more
>>> meaningful message (something like, "Optimization XX was not able to
>>> be applied because of YY") might be helpful.  It also sounds like I
>>> can safely ignore them as they won't affect the meaning of the pig
>>> script, just how it is actually executed.
>>>
>>> If this is wrong, I can try to distill down the script and get the
>>> same warnings.
>>>
>>> --Eric
>>>
>>>
>>>
>>> On Tue, Mar 30, 2010 at 10:37 PM, Ashutosh Chauhan
>>> <[EMAIL PROTECTED]> wrote:
>>>> After Pig compiles a query into a series of map-reduce plans, it
>>>> iterates through those jobs once more, looking for opportunities for
>>>> further optimization. While visiting such a compiled plan, there are
>>>> certain invariants which must always hold. If Pig finds that one of
>>>> them does not hold, it backs out and doesn't apply that optimization
>>>> step. It seems that you are running into that situation.
>>>> I am not sure how this affects the computation of your query. Can you
>>>> paste a stripped-down version of your query which generates these
>>>> warning messages?
>>>>
>>>> Ashutosh
>>>>
>>>> On Fri, Mar 26, 2010 at 10:55, Eric Tschetter <[EMAIL PROTECTED]> wrote:
>>>>> I'm writing a fairly involved pig script to do some data munging and
>>>>> after all was said and done, I end up getting the warnings mentioned
>>>>> in the subject.
>>>>>
>>>>> The grunt output is actually:
>>>>>
>>>>> 2010-03-26 09:42:33,245 [main] WARN
>>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>>> - Encountered Warning DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 2 time(s).
>>>>> 2010-03-26 09:42:33,250 [main] WARN
>>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>>> - Encountered Warning MULTI_LEAF_MAP 1 time(s).
>>>>>
>>>>> This is on pig 0.6.0.  I tried to find a document that can explain
>>>>> what these mean, but St. Google wasn't of much help.  So, I looked at
>>>>> the code to try and decipher what they would mean and from what I can
>>>>> tell
>>>>>
>>>>> "DID_NOT_FIND_LOAD_ONLY_MAP_PLAN" means that there is a map plan that
>>>>> meets one of the following conditions
>>>>>
>>>>> 1) has no leaf
>>>>> 2) has more than one leaf
>>>>> 3) has no root
>>>>> 4) has more than one root
>>>>> 5) the leaf is not equivalent to the root
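
[Editorial sketch, not part of the original message.] The five conditions listed above can be read as the negation of a single check: the map plan has exactly one root, exactly one leaf, and they are the same operator. The Python below is only an illustration of that reading. It is not Pig's code; the plan representation (a dict mapping each operator name to the operators it feeds) and the function names are assumptions made up for this example.

```python
def roots_and_leaves(plan):
    """Return (roots, leaves) of a plan.

    `plan` is a hypothetical representation: a dict mapping every operator
    name to the list of operators it feeds. Every operator must be a key.
    """
    # Operators that appear as a successor of some other operator.
    has_predecessor = {succ for succs in plan.values() for succ in succs}
    roots = [op for op in plan if op not in has_predecessor]
    leaves = [op for op, succs in plan.items() if not succs]
    return roots, leaves


def is_load_only_map_plan(plan):
    """True only if none of the five listed conditions applies:
    exactly one root, exactly one leaf, and both are the same operator."""
    roots, leaves = roots_and_leaves(plan)
    return len(roots) == 1 and len(leaves) == 1 and roots[0] == leaves[0]


# A single load operator is both the root and the leaf.
load_only = {"load": []}
# A load feeding a filter: one root and one leaf, but they differ (condition 5).
load_then_filter = {"load": ["filter"], "filter": []}
# A load feeding two branches: more than one leaf (condition 2).
two_leaves = {"load": ["a", "b"], "a": [], "b": []}
```

Under this reading, a plan like load_then_filter or two_leaves is one that would fail the check, while load_only would pass it.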
>>>>>
>>>>> "MULTI_LEAF_MAP" apparently means that there are either zero or >=2