Hadoop, mail # user - aggregation by time window


Re: aggregation by time window
Oleg Ruchovets 2013-01-28, 13:43
Hi Kai,
    That is very interesting. Can you please explain your idea in more
detail?
What would the key be in the map phase?

Suppose we have an event at 10:07. How would you emit it to multiple
buckets?

Thanks
Oleg.
On Mon, Jan 28, 2013 at 3:17 PM, Kai Voigt <[EMAIL PROTECTED]> wrote:

> Quick idea:
>
> since each of your events will go into several buckets, you could use
> map() to emit each item multiple times for each bucket.
>
> On 28.01.2013 at 13:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >    I have the following row data structure:
> >
> > event_id  |  time
> > ==================
> > event1    |  10:07
> > event2    |  10:10
> > event3    |  10:12
> >
> > event4    |  10:20
> > event5    |  10:23
> > event6    |  10:25
>
> map(event1,10:07) would emit (10:04,event1), (10:05,event1), ...,
> (10:10,event1) and so on.
>
> In reduce(), all your desired events would meet for the same minute.
>
> Kai
>
> --
> Kai Voigt
> [EMAIL PROTECTED]
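
To make Kai's suggestion concrete, here is a minimal sketch in plain Python
that simulates the map and reduce steps locally. It is not Hadoop API code.
The thread does not state the window size, so the sketch assumes each event
belongs to every minute bucket from 3 minutes before to 3 minutes after its
timestamp, which matches the 10:04 to 10:10 range in Kai's example. The names
map_event, shuffle_and_reduce, before and after are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FMT = "%H:%M"

def map_event(event_id, time_str, before=3, after=3):
    """Emit one (bucket_minute, event_id) pair per bucket the event touches.

    `before` and `after` (in minutes) are assumed window bounds; the
    original thread does not specify them.
    """
    t = datetime.strptime(time_str, FMT)  # only hour and minute are used
    for offset in range(-before, after + 1):
        yield (t + timedelta(minutes=offset)).strftime(FMT), event_id

def shuffle_and_reduce(pairs):
    """Group values by key, as the shuffle phase would, then collect them."""
    buckets = defaultdict(list)
    for bucket_minute, event_id in pairs:
        buckets[bucket_minute].append(event_id)
    # Each reduce call sees all events that share one minute bucket.
    return {minute: sorted(ids) for minute, ids in sorted(buckets.items())}

# Sample rows taken from the original question.
events = [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
]

pairs = [p for event_id, ts in events for p in map_event(event_id, ts)]
result = shuffle_and_reduce(pairs)

for minute, ids in result.items():
    print(minute, ids)
```

So the map key is the bucket minute and the value is the event id. The cost
of this approach is that the map output grows by a factor equal to the number
of buckets per event (7 under the assumption above).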