manipulating HBaseStorage map outside of a UDF?
Norbert Burger 2011-09-01, 01:55
I'm using HBaseStorage to load a large column family (many columns)
into a relation, which generates a map on each row. The maps are
wide and sparse (only a few keys exist on each row), and I'd ideally
like to GROUP the maps by similar columns before passing them off
to a UDF for further processing.
Is this possible? I'd be fine with converting the maps to bags first,
but it seems TOBAG() just wraps the whole map in an extra bag layer.
Failing that, is there any way to manipulate these kinds of relations
in Pig without explicitly specifying each map key?
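
To make the intent concrete, here is a minimal sketch in plain Python (not Pig Latin) of the two operations described above: flattening each sparse map into (row key, column, value) tuples, and grouping rows by the columns they contain. The sample rows and the function names flatten_map and group_by_columns are hypothetical, and "similar columns" is assumed here to mean "exactly the same set of map keys".

```python
from collections import defaultdict

# Hypothetical sample data: (row key, sparse map of column -> value),
# standing in for what HBaseStorage would load for a column family.
rows = [
    ("row1", {"a": "1", "c": "3"}),
    ("row2", {"b": "2"}),
    ("row3", {"a": "9", "c": "7"}),
]


def flatten_map(row_key, cols):
    """Turn one map into a list of (row_key, column, value) tuples."""
    return [(row_key, col, val) for col, val in sorted(cols.items())]


def group_by_columns(rows):
    """Group rows by the set of keys present in each map.

    Assumption: two rows are 'similar' when their key sets are equal.
    """
    groups = defaultdict(list)
    for row_key, cols in rows:
        groups[frozenset(cols)].append((row_key, cols))
    return dict(groups)


flat = [t for row_key, cols in rows for t in flatten_map(row_key, cols)]
groups = group_by_columns(rows)
```

Neither function needs to know the map keys in advance, which is the property being asked for on the Pig side.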