Pig, mail # user - conditional and multiple generate inside foreach?


conditional and multiple generate inside foreach?
Dexin Wang 2011-07-22, 23:42
Is it possible to use conditionals and more than one GENERATE inside a FOREACH?

For example, I have tuples like this, with fields (name, days_ago):

(a,0)
(b,1)
(c,9)
(d,40)

b showed up 1 day ago, so it belongs to all of the following: yesterday, last
week, last month, and last quarter. So I'd like to turn the above into:

(a,0,today)
(b,1,yesterday)
(b,1,week)
(b,1,month)
(b,1,quarter)
(c,9,month)
(c,9,quarter)
(d,40,quarter)

I imagine/dream I could do something like this

B = FOREACH A
  {
        if (days_ago <= 90) generate name,days_ago,'quarter';
        if (days_ago <= 30) generate name,days_ago,'month';
        if (days_ago <= 7)   generate name,days_ago,'week';
        if (days_ago == 1)   generate name,days_ago,'yesterday';
        if (days_ago == 0)   generate name,days_ago,'today';
  }

Of course, that's not valid syntax. I could write my own UDF, but it would be
nice if there were some way to get what I want without a UDF.
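
One UDF-free route that might work, as an untested sketch: SPLIT can send the
same tuple to more than one relation, so each bucket could be split out,
labeled, and unioned back together. The input path 'input.txt', the field
types, and the aliases Q, M, W, Y, T, their labeled counterparts, and the
field name period are all assumptions made for illustration.

```pig
-- Assumed input: tab-separated (name, days_ago) at a hypothetical path
A = LOAD 'input.txt' AS (name:chararray, days_ago:int);

-- SPLIT may assign one tuple to several output relations,
-- so a tuple lands in every bucket whose condition it satisfies
SPLIT A INTO
    Q IF days_ago <= 90,
    M IF days_ago <= 30,
    W IF days_ago <= 7,
    Y IF days_ago == 1,
    T IF days_ago == 0;

-- Tag each bucket with its label
QL = FOREACH Q GENERATE name, days_ago, 'quarter'   AS period;
ML = FOREACH M GENERATE name, days_ago, 'month'     AS period;
WL = FOREACH W GENERATE name, days_ago, 'week'      AS period;
YL = FOREACH Y GENERATE name, days_ago, 'yesterday' AS period;
TL = FOREACH T GENERATE name, days_ago, 'today'     AS period;

-- Recombine; UNION does not guarantee any particular tuple order
B = UNION QL, ML, WL, YL, TL;
DUMP B;
```

With the conditions exactly as in the pseudocode above, (a,0) would also get
week, month, and quarter rows, not only today; the SPLIT conditions would need
tightening if only the today row is wanted for it.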

Thanks!
Dexin