Pig, mail # user - Joining inner and outer bags

Joining inner and outer bags
Kris Coward 2011-01-07, 17:20

I've got an outer bag/relation consisting of a bunch of user information,
one piece of which is an inner bag of possible events for that user,
along with the value of each event should it occur. Outside the inner
bag, there are also a few fields recording whether specific events have
already occurred.

In another relation, I have the assortment of events, each paired with
the probability that it will occur.

I'd like to generate expected values for each user, but know that I
can't JOIN within a FOREACH block (or do a nested FOREACH). For a UDF,
I vaguely recall some sort of constraint on nesting inner bags that
would interfere with my ability to bundle the possible events bag with
the actual events data into a single object that could be passed to a
UDF that extends EvalFunc.

Am I misremembering something? If I'm not, is there some other sort of
clever trickery that I might be able to use to generate expected values?
And if I am, is there something less hackish than a GROUP on a unique
tuple element that I could use to load the desired values into a bag or
tuple (or just plain pass the entire tuple to a UDF)?
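
For concreteness, the GROUP-based route mentioned above might look roughly
like the sketch below: flatten the inner bag so each possible event becomes
its own row, join against the probabilities, then regroup by user and sum.
Every relation name, field name, schema and path here is hypothetical and
stands in for the real data, and the flags for events that have already
occurred are left out entirely.

```pig
-- Hypothetical inputs: the schemas and paths are assumptions, not the real data.
users = LOAD 'users' AS (user_id:chararray,
                         events:bag{t:(event:chararray, value:double)});
probs = LOAD 'event_probs' AS (event:chararray, prob:double);

-- One row per (user, possible event) instead of one row per user.
flat = FOREACH users GENERATE user_id, FLATTEN(events) AS (event, value);

-- Attach each event's probability. This is an inner join, so events
-- with no matching probability row are dropped.
joined = JOIN flat BY event, probs BY event;

-- Probability-weighted value of each possible event.
weighted = FOREACH joined GENERATE flat::user_id AS user_id,
                                   flat::value * probs::prob AS ev;

-- Regroup by user and sum the weighted values.
by_user  = GROUP weighted BY user_id;
expected = FOREACH by_user GENERATE group AS user_id,
                                    SUM(weighted.ev) AS expected_value;
```

This relies on user_id being unique per user, which is the "GROUP on a
unique tuple element" hack, so it shows the baseline rather than a
cleaner alternative.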


Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3