Pig >> mail # user >> force schema with TOBAG

Re: force schema with TOBAG
Cheolsoo

Thank you for the response. This works in 0.10, but not on 0.9.2-amzn. There I get an error message that there's an unexpected symbol at or near $ID_NULL. (ID_NULL is being replaced; I just thought it would be clearer to show it this way here.)

David

On Oct 30, 2012, at 2:29 PM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:

> Hi David,
>
> Try *adding parentheses* inside the TOBAG:
>
> normal1 = TOBAG( ('$ID_NULL', 0L) );
>
> or
>
> value1 = ( IsEmpty(relation1) ? TOBAG( ('$ID_NULL', 0L) ) : relation1 );
>
> The reason is that TOBAG('$ID_NULL', 0L) means { ('$ID_NULL'), (0) }, that
> is, two single-field tuples. But I believe that what you want is
> { ('$ID_NULL', 0) }, one two-field tuple, given the schema of relation1.
>
> Thanks,
> Cheolsoo
>
> On Tue, Oct 30, 2012 at 10:22 AM, David LaBarbera <
> [EMAIL PROTECTED]> wrote:
>
>> I have a cogroup which effectively does a full outer join of two
>> relations. Some of the relations are blank, so I have a FOREACH statement
>> like
>>
>> grouped = COGROUP relation1 BY x, relation2 BY y;
>> normalized = FOREACH grouped {
>>   normal1 = TOBAG('$ID_NULL', 0L);
>>   value1 = ( IsEmpty(relation1) ? normal1 : relation1 );
>>   GENERATE relation1, relation2;
>> }
>>
>> I get an error on the bincond that the left and right schemas don't match.
>> I'm informed that TOBAG returns the schema
>> bag{:tuple(:NULL)}
>> and relation1 is
>> bag{:tuple(id:chararray,timestamp:long)}
>>
>> I'm running this on EMR, which has a modified version of 0.9.2. Any
>> thoughts on how to force TOBAG's schema to match relation1's?
>>
>> David
>>
>>
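
For reference, the difference between the two bag shapes discussed above can be sketched in a small standalone script. This is only a sketch: the input file 'ids.txt' and the aliases are hypothetical, and it uses the TOTUPLE builtin in place of the parenthesized tuple literal. Whether that form parses on 0.9.2-amzn has not been verified here.

```pig
-- Hypothetical input: 'ids.txt' with one chararray per line (assumption).
A = LOAD 'ids.txt' AS (id:chararray);

-- Two separate arguments: each one is wrapped in its own single-field
-- tuple, so the bag looks like { (id), (0) }.
two_tuples = FOREACH A GENERATE TOBAG(id, 0L);

-- One tuple argument: the bag holds a single two-field tuple,
-- like { (id, 0) }, which is the shape of relation1 in the thread.
one_tuple = FOREACH A GENERATE TOBAG(TOTUPLE(id, 0L));

-- Print both schemas to compare them.
DESCRIBE two_tuples;
DESCRIBE one_tuple;
```

Comparing the two DESCRIBE results should show whether the bag's inner tuple schema lines up with the other branch of the bincond.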