Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Incremental Data Processing With Hive UDAF

Copy link to this message
Re: Incremental Data Processing With Hive UDAF
Any suggestions on this are greatly appreciated. Any one see major road
blocks on this?


On Sat, Jan 12, 2013 at 10:31 AM, buddhika chamith

> Hi All,
> In order to achieve above I am researching on the feasibility of using a
> set of custom UADFs for distributive aggregate operations (e.g: sum, count
> etc..). Idea is to incorporate some state persisted from earlier
> aggregations to the current aggregation value inside merge of the UDAF. For
> distributing state data I was thinking of utilizing Hadoop distributed
> cache. But I am not sure about how exactly UDAF's are executed at runtime.
> Would including the logic to add the persisted state to the current result
> at terminate() ensure that it would be added only once? (Assuming all the
> aggregations fan in at terminate. I may gotten it all wrong here. :)). Or
> is there better way of achieving the same?
> Regards
> Buddhika