Pig >> mail # user >> MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN


Re: MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
Hi Eric,

What you wrote is mostly true. A distilled-down script that generates the
same warnings would be helpful. It seems that invariants which should (in
theory) always hold on the map-reduce plan generated from a query are
being violated. I would like to see a script that makes this happen.
This might be a bug.

Ashutosh

On Wed, Mar 31, 2010 at 13:36, Eric Tschetter <[EMAIL PROTECTED]> wrote:
> Ok, so if I'm understanding you correctly, the warnings are basically
> Pig saying that it cannot apply a specific optimization, is that
> correct?  If so, then might I make a suggestion that they not show up
> as "WARNING" but as "INFO" or something.  Also, a slightly more
> meaningful message (something like, "Optimization XX was not able to
> be applied because of YY") might be helpful.  It also sounds like I
> can safely ignore them as they won't affect the meaning of the pig
> script, just how it is actually executed.
>
> If this is wrong, I can try to distill down the script and get the
> same warnings.
>
> --Eric
>
>
>
> On Tue, Mar 30, 2010 at 10:37 PM, Ashutosh Chauhan
> <[EMAIL PROTECTED]> wrote:
>> After Pig compiles a query into a series of map-reduce plans, it
>> iterates through those jobs once more, looking for opportunities for
>> further optimization. While visiting such a compiled plan, there are
>> certain invariants which must always hold. If Pig finds that one of
>> them does not, it backs out and doesn't apply that optimization step.
>> It seems that you are running into that situation.
>> I am not sure how this affects the computation of your query. Can you
>> paste a stripped-down version of your query which generates these
>> warning messages?
>>
>> Ashutosh
>>
>> On Fri, Mar 26, 2010 at 10:55, Eric Tschetter <[EMAIL PROTECTED]> wrote:
>>> I'm writing a fairly involved pig script to do some data munging and
>>> after all was said and done, I end up getting the warnings mentioned
>>> in the subject.
>>>
>>> The grunt output is actually:
>>>
>>> 2010-03-26 09:42:33,245 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 2 time(s).
>>> 2010-03-26 09:42:33,250 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning MULTI_LEAF_MAP 1 time(s).
>>>
>>> This is on pig 0.6.0.  I tried to find a document that can explain
>>> what these mean, but St. Google wasn't of much help.  So, I looked at
>>> the code to try and decipher what they would mean and from what I can
>>> tell
>>>
>>> "DID_NOT_FIND_LOAD_ONLY_MAP_PLAN" means that there is a map plan that
>>> meets one of the following conditions
>>>
>>> 1) has no leaf
>>> 2) has more than one leaf
>>> 3) has no root
>>> 4) has more than one root
>>> 5) the leaf is not equivalent to the root
>>>
>>> "MULTI_LEAF_MAP" apparently means that there are either zero or >=2
>>> leaves on a map plan.
>>>
>>> So, I'm guessing that the DID_NOT_FIND_LOAD_ONLY_MAP_PLAN warnings are
>>> coming from having 2 leaves.  Unfortunately, I have no clue what that
>>> means.  Can anyone shed some light on what these warnings mean, if I
>>> need to care about them and what I should look for to get rid of them
>>> (if they are, in fact, not good)?
>>>
>>> Also, if the meaning of these warnings is documented somewhere and
>>> Google just couldn't find it, please point me to the documentation.
>>>
>>> --Eric Tschetter
>>>
>>
>
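
The conditions Eric lists for the two warnings, together with the "check the invariant, back out if it fails" behaviour Ashutosh describes, can be sketched roughly as follows. This is an illustrative model only, not Pig's actual code: `MapPlan`, `check_plan`, and `try_optimize` are hypothetical names, the plan is reduced to a list of operator names plus edges, and both checks are applied to every plan, which is a simplification of whatever Pig 0.6.0 really does.

```python
from collections import Counter


class MapPlan:
    """Hypothetical stand-in for a map-side operator DAG (not Pig's real class)."""

    def __init__(self, operators, edges):
        self.operators = list(operators)
        self.edges = list(edges)  # (upstream, downstream) pairs

    def roots(self):
        # Operators that no edge points into.
        has_input = {dst for _, dst in self.edges}
        return [op for op in self.operators if op not in has_input]

    def leaves(self):
        # Operators that no edge points out of.
        has_output = {src for src, _ in self.edges}
        return [op for op in self.operators if op not in has_output]


def check_plan(plan):
    """Return the warnings a plan would raise under Eric's reading of the code."""
    warnings = []
    roots, leaves = plan.roots(), plan.leaves()

    # "MULTI_LEAF_MAP": zero leaves, or two or more.
    if len(leaves) != 1:
        warnings.append("MULTI_LEAF_MAP")

    # "DID_NOT_FIND_LOAD_ONLY_MAP_PLAN": anything other than exactly one
    # root and one leaf that are the same operator (conditions 1-5).
    load_only = len(roots) == 1 and len(leaves) == 1 and roots[0] == leaves[0]
    if not load_only:
        warnings.append("DID_NOT_FIND_LOAD_ONLY_MAP_PLAN")

    return warnings


def try_optimize(plans):
    """Leave invariant-violating plans untouched and tally the warnings."""
    tally = Counter()
    eligible = []
    for plan in plans:
        warnings = check_plan(plan)
        if warnings:
            tally.update(warnings)  # back out: the plan is left as compiled
        else:
            eligible.append(plan)
    return eligible, tally


if __name__ == "__main__":
    plans = [
        MapPlan(["load"], []),
        MapPlan(["load", "a", "b"], [("load", "a"), ("load", "b")]),
    ]
    _, tally = try_optimize(plans)
    for name, count in tally.items():
        print(f"Encountered Warning {name} {count} time(s).")
```

In this model a skipped optimization changes only how a plan would be executed, not what it computes, which matches Eric's reading that the warnings do not alter the meaning of the script. Whether that holds in Pig itself is exactly what Ashutosh leaves open above.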