Pig, mail # user - Pig Conditionals (Do I have to use UDFs)?


Re: Pig Conditionals (Do I have to use UDFs)?
Eli Finkelshteyn 2011-09-14, 21:24
Sorry, bad example, I guess. I want something I can write case statements
with. In this case I could use a mapping instead, but if I wanted less
straightforward cases (e.g. one case where number == 1, another where
number is between 2 and 4, another where number is greater than 5, etc.),
it would be much more difficult to do with a mapping.
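
To make the range cases concrete, here is a minimal sketch using nested
bincond expressions, assuming the example.txt layout quoted further down.
The relation name BUCKETED, the field name bucket, and the label strings
are made up for illustration, and the 'other' branch is an assumed
catch-all for values (such as 5) that match none of the conditions:

```pig
-- Assumes example.txt holds comma-separated item,number rows, as in the quoted message.
EXAMPLE_SOURCE = LOAD 'example.txt' USING PigStorage(',') AS
    (item:chararray, number:int);

-- Nested bincond: conditions are checked in order, so the first match wins.
-- 'other' is a hypothetical fallback for numbers that match no case.
BUCKETED = FOREACH EXAMPLE_SOURCE GENERATE item,
    ((number == 1) ? 'one'
      : ((number >= 2 AND number <= 4) ? 'two to four'
        : ((number > 5) ? 'over five' : 'other'))) AS bucket;

DUMP BUCKETED;
```

It works, but every extra case adds another level of nesting, which is
the messiness I'd like to avoid.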

Again, I know this is something I can do with UDFs, but it seemed like
something light enough to be built into Pig itself, so I was hoping
there was a way to do it without needing to write a UDF every time I
have a new transformation to make.

Eli

On 9/14/11 5:07 PM, Ryan Hoegg wrote:
> What about putting the mappings into their own relation?  I tried this with
> 0.9.0:
>
> example.txt:
> a,1
> a,2
> b,2
> c,1
> d,3
> d,4
>
> mapping.txt:
> 1,one
> 2,two
> 3,three
> 4,four
>
> MAPPINGS = LOAD 'mapping.txt' USING PigStorage(',') AS
> (number:int,name:chararray);
> EXAMPLE_SOURCE = LOAD 'example.txt' USING PigStorage(',') AS
> (item:chararray,number:int);
> MAPPED = JOIN EXAMPLE_SOURCE BY number LEFT OUTER, MAPPINGS BY number;
> PRETTY = FOREACH MAPPED GENERATE item, name;
> DUMP PRETTY;
> (a,one)
> (c,one)
> (a,two)
> (b,two)
> (d,three)
> (d,four)
>
> --
> Ryan Hoegg
>
> On Wed, Sep 14, 2011 at 3:27 PM, Eli Finkelshteyn <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> I'd like to generate based on exclusive conditions (something like the CASE
>> statement in SQL). An example:
>>
>> Say I have data that looks like:
>>
>> (a, 1)
>> (a, 2)
>> (b, 2)
>> (c, 1)
>> (d, 3)
>> (d, 4)
>>
>> And I want to just convert each of the numbers to their written forms to
>> get:
>>
>> (a, one)
>> (a, two)
>> (b, two)
>> (c, one)
>> (d, three)
>> (d, four)
>>
>> Would I need to write a UDF for that, or is there some simple way to do it
>> using cases? I know I can nest a bunch of bincond expressions one inside
>> the other to achieve this, like:
>>
>> FOREACH rel GENERATE $0, (($1==1) ? 'one' : (($1 == 2) ? 'two' : (($1 == 3)
>> ? 'three' : 'four')));
>>
>> but that seems too messy. I'd appreciate any advice.
>>
>> Thanks!
>> Eli
>>
>>
>>