Re: MULTI_LEAF_MAP and DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
Ashutosh Chauhan 2010-04-07, 23:27
Hi Eric,
What you wrote is mostly true. A distilled-down script that generates the same warnings would be helpful. The thing is, it seems that invariants which should (in theory) always hold on the map-reduce plan generated from a query are being violated. I would like to see a script that makes this happen. This might be a bug.

Ashutosh

On Wed, Mar 31, 2010 at 13:36, Eric Tschetter <[EMAIL PROTECTED]> wrote:
> Ok, so if I'm understanding you correctly, the warnings are basically
> Pig saying that it cannot apply a specific optimization, is that
> correct? If so, then might I make a suggestion that they not show up
> as "WARNING" but as "INFO" or something. Also, a slightly more
> meaningful message (something like, "Optimization XX was not able to
> be applied because of YY") might be helpful. It also sounds like I
> can safely ignore them as they won't affect the meaning of the pig
> script, just how it is actually executed.
>
> If this is wrong, I can try to distill down the script and get the
> same warnings.
>
> --Eric
>
>
>
> On Tue, Mar 30, 2010 at 10:37 PM, Ashutosh Chauhan
> <[EMAIL PROTECTED]> wrote:
>> After Pig compiles a query into a series of map-reduce plans, it once
>> again iterate through those jobs trying to spot the opportunities of
>> further optimizations. While visiting such compiled plan, there are
>> certain invariants which must always hold. If Pig finds contrary to
>> it, it backs out and doesn't try to apply that optimization step. It
>> seems that you are running into that situation.
>> How this affects computation of your query I am not sure. Can you
>> paste stripped down version of your query which generates these
>> warning messages?
>>
>> Ashutosh
>>
>> On Fri, Mar 26, 2010 at 10:55, Eric Tschetter <[EMAIL PROTECTED]> wrote:
>>> I'm writing a fairly involved pig script to do some data munging and
>>> after all was said and done, I end up getting the warnings mentioned
>>> in the subject.
>>>
>>> The grunt output is actually:
>>>
>>> 2010-03-26 09:42:33,245 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 2 time(s).
>>> 2010-03-26 09:42:33,250 [main] WARN
>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>> - Encountered Warning MULTI_LEAF_MAP 1 time(s).
>>>
>>> This is on pig 0.6.0. I tried to find a document that can explain
>>> what these mean, but St. Google wasn't of much help. So, I looked at
>>> the code to try and decipher what they would mean and from what I can
>>> tell
>>>
>>> "DID_NOT_FIND_LOAD_ONLY_MAP_PLAN" means that there is a map plan that
>>> meets one of the following conditions
>>>
>>> 1) has no leaf
>>> 2) has more than one leaf
>>> 3) has no root
>>> 4) has more than one root
>>> 5) the leaf is not equivalent to the root
>>>
>>> "MULTI_LEAF_MAP" apparently means that there are either zero or >=2
>>> leaves on a map plan.
>>>
>>> So, I'm guessing that the DID_NOT_FIND_LOAD_ONLY_MAP_PLAN warnings are
>>> coming from having 2 leaves. Unfortunately, I have no clue what that
>>> means. Can anyone shed some light on what these warnings mean, if I
>>> need to care about them and what I should look for to get rid of them
>>> (if they are, in fact, not good)?
>>>
>>> Also, if the meaning of these warnings is documented somewhere and
>>> Google just couldn't find it, please point me to the documentation.
>>>
>>> --Eric Tschetter
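
For anyone following the thread, Eric's reading of the two warnings can be sketched as a pair of checks on a toy operator graph. This is only an illustration of the conditions he lists, not Pig's actual code: `ToyPlan`, `is_load_only` and `is_multi_leaf` are hypothetical names, and a plan is assumed to be nothing more than operator names plus directed edges.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPlan:
    """Hypothetical stand-in for a map plan: operator names and directed edges."""
    operators: list
    edges: list = field(default_factory=list)  # (source, destination) pairs

    def roots(self):
        # Roots are operators that no edge points to.
        targets = {dst for _, dst in self.edges}
        return [op for op in self.operators if op not in targets]

    def leaves(self):
        # Leaves are operators that no edge starts from.
        sources = {src for src, _ in self.edges}
        return [op for op in self.operators if op not in sources]


def is_load_only(plan):
    """True only if none of Eric's five conditions holds:
    exactly one root, exactly one leaf, and they are the same operator."""
    roots, leaves = plan.roots(), plan.leaves()
    return len(roots) == 1 and len(leaves) == 1 and roots[0] == leaves[0]


def is_multi_leaf(plan):
    """Eric's reading of MULTI_LEAF_MAP: zero leaves, or two or more."""
    return len(plan.leaves()) != 1


# A plan holding a single load operator passes both checks.
single_load = ToyPlan(["load"])

# load -> filter has one root and one leaf, but they differ (condition 5).
load_filter = ToyPlan(["load", "filter"], [("load", "filter")])

# One load feeding two stores has two leaves (condition 2).
split = ToyPlan(["load", "store_a", "store_b"],
                [("load", "store_a"), ("load", "store_b")])
```

Under this toy model, `split` fails both checks, which matches Eric's guess that a plan with two leaves could account for both kinds of warning.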