Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Incremental Data Processing With Hive UDAF


Copy link to this message
-
Re: Incremental Data Processing With Hive UDAF
buddhika chamith 2013-01-14, 14:46
Any suggestions on this are greatly appreciated. Any one see major road
blocks on this?

Regards
Buddhika

On Sat, Jan 12, 2013 at 10:31 AM, buddhika chamith
<[EMAIL PROTECTED]>wrote:

> Hi All,
>
> In order to achieve above I am researching on the feasibility of using a
> set of custom UADFs for distributive aggregate operations (e.g: sum, count
> etc..). Idea is to incorporate some state persisted from earlier
> aggregations to the current aggregation value inside merge of the UDAF. For
> distributing state data I was thinking of utilizing Hadoop distributed
> cache. But I am not sure about how exactly UDAF's are executed at runtime.
> Would including the logic to add the persisted state to the current result
> at terminate() ensure that it would be added only once? (Assuming all the
> aggregations fan in at terminate. I may gotten it all wrong here. :)). Or
> is there better way of achieving the same?
>
> Regards
> Buddhika
>