Pig, mail # user - How to flatten a map?


Re: How to flatten a map?
Jacob Perkins 2011-02-28, 06:25
Sending to the list as well. Sorry about that.

Is it possible for you to read your data into a bag instead? That
way you don't have to know the keys explicitly, and FLATTEN works just
fine. Otherwise you may have to write a simple UDF that reads in the map
and returns a bag of tuples, one (key, value) tuple per map entry. Then
you can just FLATTEN the bag like normal.
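
For illustration, here is a rough sketch of what such a UDF might look like
as a Python (Jython) UDF. It has not been tested against Pig. The file name
map_udfs.py, the function name map_to_bag, the alias mapudfs, and the
declared schema are all made up for this example, and it assumes Pig hands
the map to the function as a dict and provides the outputSchema decorator
when the script is registered.

```python
# Hypothetical file: map_udfs.py (all names here are illustrative assumptions).

try:
    outputSchema  # assumed to be supplied by Pig when registered as a Jython UDF
except NameError:
    def outputSchema(schema):
        """No-op stand-in so the file can also be run outside Pig."""
        def decorate(func):
            return func
        return decorate


# The schema assumes string values; adjust it to match the real data.
@outputSchema("pairs:bag{t:tuple(key:chararray, value:chararray)}")
def map_to_bag(m):
    """Return one (key, value) tuple per map entry; an empty bag for a null map."""
    if m is None:
        return []
    # Sort by key only so the output order is predictable.
    return [(k, v) for k, v in sorted(m.items())]


# Possible use from Pig Latin (untested sketch):
#   REGISTER 'map_udfs.py' USING jython AS mapudfs;
#   A = LOAD 'map_data.txt' AS (id:int, friends:map[]);
#   B = FOREACH A GENERATE id, FLATTEN(mapudfs.map_to_bag(friends));
```

The function only turns the map into a bag; FLATTEN on that bag is what
produces one row per key/value pair.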

--jacob
@thedatachef
On Sun, 2011-02-27 at 22:16 -0800, Bill Graham wrote:
> Hi,
>
> I'd like to be able to flatten a map, so each K->V is flattened into
> its own row. Basically something like this:
>
> cat map_data.txt
> 32      [123#bill,222#joe]
> 77      [977#mary,987#jane]
> 44      [23#tim,437#maria]
>
> A = LOAD 'map_data.txt' AS (id:int, friends:map[]);
> B = FOREACH A GENERATE id, FLATTEN(friends);
> dump B;
>
> 32      (123,bill)
> 32      (222,joe)
> 77      (977,mary)
> 77      (987,jane)
> 44      (23,tim)
> 44      (437,maria)
>
> The problem, though, is that FLATTEN doesn't seem to work on maps and
> instead B is the same as A. The only map-related functionality I've
> been able to find is pulling values by a known key, or returning the
> map size.
>
> Does anyone have any suggestions for how I could do this? Is it
> possible to write/extend/patch something to provide this
> functionality?
>
> thanks,
> Bill