Pig >> mail # user >> Another schema mixup, was: Re: group schema getting wrong fields?


Another schema mixup, was: Re: group schema getting wrong fields?
This problem went away in 0.10, but it has reappeared in a slightly different
context in the current trunk.
In this script, I have something like:
split a into b, c;
d = join b, x;
pd = project d;
e = union pd, c;
split e into f, g;
h = project f;  (this is where the incorrect field name is used, causing an
error)
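
A minimal sketch of a script with this shape might look like the following. Everything concrete in it is an assumption for illustration: the file names, field names, and split conditions are hypothetical, and the "project" steps are written as foreach ... generate. It shows the structure described above and is not confirmed to reproduce the error.

```pig
-- Hypothetical inputs; the real relations and fields are not shown in this message.
a = load 'a.txt' as (id:chararray, val:int);
x = load 'x.txt' as (id:chararray, label:chararray);

-- Hypothetical split conditions.
split a into b if val >= 0, c if val < 0;

d  = join b by id, x by id;
-- "project d": keep only the fields that line up with c's schema.
pd = foreach d generate b::id as id, b::val as val;

e = union pd, c;
split e into f if val > 10, g if val <= 10;

-- "project f": the step where the wrong field name is reported.
h = foreach f generate id, val;
describe h;
```

Running describe on each intermediate relation (d, pd, e, f) is one way to see at which step the field names stop matching what was requested.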
On Mon, Aug 27, 2012 at 5:14 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Yeah, I think this is a known issue with filters and relations. Use the
> workaround for now, but I think trunk has the fix.
>
> Thanks
>
> 2012/8/24 Lauren Blau <[EMAIL PROTECTED]>
>
> > Actually, if I replace the filters that create the original 2 relations
> > with a split, the problem goes away. (I just saw split used in another
> > message and realized I could use it.)
> >
> > On Fri, Aug 24, 2012 at 4:11 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> >
> > > fcels and fnot are both filtered from the same original relation.
> > >
> > >
> > > On Fri, Aug 24, 2012 at 4:11 PM, Lauren Blau <[EMAIL PROTECTED]> wrote:
> > >
> > >> How much more? Here's the cxels:
> > >>
> > >> bigcross = join fcels by (chararray)messageId, fnot by (chararray)messageId;
> > >> filt1 = filter bigcross by (int)fcels::astart <= (int)fnot::astart;
> > >> filt2 = filter filt1 by (int)fcels::aend >= (int)fnot::aend;
> > >>
> > >> cxels = foreach filt2 generate
> > >>     fcels::messageId as messageId:chararray,
> > >>     fcels::astart as celstart:int,
> > >>     fcels::aend as celend:int,
> > >>     fnot::alabel as notcellabel:chararray,
> > >>     fnot::astart as notcelstart:int,
> > >>     fnot::aend as notcelend:int;
> > >>
> > >>
> > >> On Fri, Aug 24, 2012 at 3:07 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> Can you post more of your script?
> > >>>
> > >>> 2012/8/24 Lauren Blau <[EMAIL PROTECTED]>
> > >>>
> > >>> > I'm running pig 0.9.2 and seeing this:
> > >>> >
> > >>> > grunt> describe cxels;
> > >>> > cxels: {messageId: chararray,celstart: int,celend: int,notcellabel:
> > >>> > chararray,notcelstart: int,notcelend: int}
> > >>> > grunt> gcxels = group cxels by (messageId,celstart,celend);
> > >>> > grunt> describe gcxels;
> > >>> > gcxels: {group: (messageId: chararray,notcelstart: int,notcelend:
> > >>> > int),cxels: {(messageId: chararray,celstart: int,celend: int,notcellabel:
> > >>> > chararray,notcelstart: int,notcelend: int)}}
> > >>> >
> > >>> >
> > >>> > why does the schema for gcxels::group show notcelstart and notcelend
> > >>> > when I gave it celstart,celend as the grouping fields?
> > >>> > Is the fieldname not being matched correctly?
> > >>> >
> > >>> > Thanks,
> > >>> > lauren
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>
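
The workaround mentioned in the thread, replacing the two filters over one relation with a single split, could be sketched as follows. The source relation, its file name, and the conditions are assumptions for illustration only; the thread does not show how fcels and fnot were actually derived.

```pig
-- Hypothetical source relation; field names follow those used in the thread.
src = load 'annotations.txt'
      as (messageId:chararray, alabel:chararray, astart:int, aend:int);

-- Original shape (two filters over the same relation), with assumed conditions:
-- fcels = filter src by alabel == 'cel';
-- fnot  = filter src by alabel != 'cel';

-- Workaround shape: one split producing both relations.
split src into fcels if alabel == 'cel', fnot if alabel != 'cel';

bigcross = join fcels by messageId, fnot by messageId;
describe bigcross;
```

Comparing the describe output of the downstream group under both shapes would show whether the group key picks up the intended fields.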