Another schema mixup, was: Re: group schema getting wrong fields?
Lauren Blau 2012-10-14, 01:44

This problem went away in 0.10, but has re-appeared in a slightly different
context in the current trunk. In this script, I have something like

split a into b,c
d = join b, x;
pd = project d;
e = union pd, c;
split e into f,g
h = project f (this is where I get the incorrect fieldname being used
causing an error.)

On Mon, Aug 27, 2012 at 5:14 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Yeah, I think this is a known issue with filters and relations. Use the
> fix, but I think trunk has the fix.
>
> Thanks
>
> 2012/8/24 Lauren Blau <[EMAIL PROTECTED]>
>
> > actually, if I replace the filters that create the original 2 relations
> > with a split, the problem goes away. (i just saw split used in another
> > message and realized I could use it)
> >
> > On Fri, Aug 24, 2012 at 4:11 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >
> > > fcels and fnot are both filtered from the same original relation.
> > >
> > >
> > > On Fri, Aug 24, 2012 at 4:11 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> > >
> > >> how much more. Here's the cxels:
> > >>
> > >> bigcross = join fcels by (chararray)messageId, fnot by (chararray)
> > >> messageId;
> > >> filt1 = filter bigcross by (int)fcels::astart <= (int)fnot::astart;
> > >> filt2 = filter filt1 by (int)fcels::aend >= (int)fnot::aend;
> > >>
> > >> cxels = foreach filt2 generate fcels::messageId as
> > >> messageId:chararray,fcels::astart as celstart:int,fcels::aend as
> > >> celend:int,fnot::alabel as notcellabel:chararray,fnot::astart as
> > >> notcelstart:int, fnot::aend as notcelend:int;
> > >>
> > >>
> > >> On Fri, Aug 24, 2012 at 3:07 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> Can you post more of your script?
> > >>>
> > >>> 2012/8/24 Lauren Blau <[EMAIL PROTECTED]>
> > >>>
> > >>> > I'm running pig 0.9.2 and seeing this:
> > >>> >
> > >>> > grunt> describe cxels;
> > >>> > cxels: {messageId: chararray,celstart: int,celend: int,notcellabel:
> > >>> > chararray,notcelstart: int,notcelend: int}
> > >>> > grunt> gcxels = group cxels by (messageId,celstart,celend);
> > >>> > grunt> describe gcxels;
> > >>> > gcxels: {group: (messageId: chararray,notcelstart: int,notcelend:
> > >>> > int),cxels: {(messageId: chararray,celstart: int,celend: int,notcellabel:
> > >>> > chararray,notcelstart: int,notcelend: int)}}
> > >>> >
> > >>> >
> > >>> > why does the schema for gcxels::group show notcelstart and notcelend
> > >>> > when I gave it celstart,celend as the grouping fields?
> > >>> > Is the fieldname not being matched correctly?
> > >>> >
> > >>> > Thanks,
> > >>> > lauren