Pig, mail # user - force schema with TOBAG


David LaBarbera 2012-10-30, 17:22
Cheolsoo Park 2012-10-30, 18:29
David LaBarbera 2012-10-31, 11:53
Cheolsoo Park 2012-10-31, 16:23
David LaBarbera 2012-10-31, 16:54
Re: force schema with TOBAG
Cheolsoo Park 2012-10-31, 17:05
Great! Thanks!

On Wed, Oct 31, 2012 at 9:54 AM, David LaBarbera <[EMAIL PROTECTED]> wrote:

> Cheolsoo
>
> That works. Thanks so much for the help.
> And congratulations on your new committer status!
>
> David
>
> On Oct 31, 2012, at 12:23 PM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
>
> > Hi David,
> >
> > How about "TOBAG( TOTUPLE( $ID_NULL, 0L ) )"? The "( )" is just
> > syntactic sugar for "TOTUPLE( )" that was introduced in 0.10. (Sorry
> > that I forgot that "( )" doesn't work in 0.9.)
> >
> > Thanks,
> > Cheolsoo
> >
> > On Wed, Oct 31, 2012 at 4:53 AM, David LaBarbera <[EMAIL PROTECTED]> wrote:
> >
> >> Cheolsoo
> >>
> >> Thank you for the response. This works in 0.10, but not on 0.9.2-amzn. I
> >> get an error message that there's an unexpected symbol at or near $ID_NULL
> >> (ID_NULL is being replaced; I just thought it would be clearer here).
> >>
> >> David
> >>
> >> On Oct 30, 2012, at 2:29 PM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi David,
> >>>
> >>> Try to *add parentheses*  inside the TOBAG:
> >>>
> >>> normal1 = TOBAG( ('$ID_NULL', 0L) );
> >>>
> >>> or
> >>>
> >>> value1 = ( IsEmpty(relation1) ? TOBAG( ('$ID_NULL', 0L) ) : relation1 );
> >>>
> >>> The reason is that TOBAG('$ID_NULL', 0L) means { ('$ID_NULL'), (0) },
> >>> a bag of two single-field tuples. But I believe that what you want is
> >>> { ('$ID_NULL', 0) }, a bag of one two-field tuple, given the schema of
> >>> relation1.
> >>>
> >>> Thanks,
> >>> Cheolsoo
> >>>
> >>> On Tue, Oct 30, 2012 at 10:22 AM, David LaBarbera <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> I have a cogroup which effectively does a full outer join of two
> >>>> relations. Some of the relations are empty, so I have a FOREACH statement
> >>>> like
> >>>>
> >>>> grouped = COGROUP relation1 BY x, relation2 BY y;
> >>>> normalized = FOREACH grouped {
> >>>>  normal1 = TOBAG('$ID_NULL', 0L);
> >>>>  value1 = ( IsEmpty(relation1) ? normal1 : relation1 );
> >>>>  GENERATE relation1, relation2;
> >>>> }
> >>>>
> >>>> I get an error on the bincond that the left and right schemas don't match.
> >>>> I'm informed that TOBAG returns the schema
> >>>> bag{:tuple(:NULL)}
> >>>> and relation1 is
> >>>> bag{:tuple(id:chararray,timestamp:long)}
> >>>>
> >>>> I'm running this on EMR, which has a modified version of 0.9.2. Any
> >>>> thoughts on how to force TOBAG's schema to match relation1's?
> >>>>
> >>>> David
> >>>>
> >>>>
> >>
> >>
>
>
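
For reference, the fix the thread arrives at can be sketched as one script. This is a minimal sketch and not the poster's actual job. The LOAD paths, the schema of relation2, and the use of id as the grouping key are assumptions made for illustration. The schema of relation1 is the one quoted in the thread, and ID_NULL is assumed to be supplied through Pig's parameter substitution (for example, pig -param ID_NULL=none script.pig).

```pig
-- Hypothetical inputs; only relation1's schema comes from the thread.
relation1 = LOAD 'input1' AS (id:chararray, timestamp:long);
relation2 = LOAD 'input2' AS (id:chararray, label:chararray);

-- Grouping key is assumed to be id on both sides.
grouped = COGROUP relation1 BY id, relation2 BY id;

normalized = FOREACH grouped {
    -- TOTUPLE builds one two-field tuple, so TOBAG yields a bag whose
    -- tuple has (chararray, long) fields, like relation1's.
    -- The ( ... ) tuple shorthand would also work on 0.10 and later.
    normal1 = TOBAG( TOTUPLE('$ID_NULL', 0L) );
    value1 = ( IsEmpty(relation1) ? normal1 : relation1 );
    GENERATE group, value1, relation2;
}

-- Print the resulting schema to check that the bincond type-checked.
DESCRIBE normalized;
```

Writing TOTUPLE explicitly keeps the script usable on 0.9.x, where, per the thread, the parenthesized tuple shorthand is not accepted.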