-Re: Incremental Data Processing With Hive UDAF
buddhika chamith 2013-01-14, 14:46
Any suggestions on this are greatly appreciated. Any one see major road
blocks on this?
On Sat, Jan 12, 2013 at 10:31 AM, buddhika chamith
> Hi All,
> In order to achieve above I am researching on the feasibility of using a
> set of custom UADFs for distributive aggregate operations (e.g: sum, count
> etc..). Idea is to incorporate some state persisted from earlier
> aggregations to the current aggregation value inside merge of the UDAF. For
> distributing state data I was thinking of utilizing Hadoop distributed
> cache. But I am not sure about how exactly UDAF's are executed at runtime.
> Would including the logic to add the persisted state to the current result
> at terminate() ensure that it would be added only once? (Assuming all the
> aggregations fan in at terminate. I may gotten it all wrong here. :)). Or
> is there better way of achieving the same?