Re: aggregation by time window
Kai Voigt 2013-01-28, 13:17
Since each of your events falls into several buckets, you could have map() emit each event multiple times, once for every bucket it belongs to.
On 28.01.2013 at 13:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
> Hi,
> I have the following row data structure:
> event_id | time
> =============
> event1 | 10:07
> event2 | 10:10
> event3 | 10:12
> event4 | 10:20
> event5 | 10:23
> event6 | 10:25
map(event1,10:07) would emit (10:04,event1), (10:05,event1), ..., (10:10,event1) and so on.
In reduce(), all your desired events would meet for the same minute.
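
A rough sketch of that idea in plain Python, simulating the map, shuffle and reduce steps in memory rather than using the actual Hadoop API. The window of 3 minutes on either side of the event time is only an assumption read off the 10:04 to 10:10 example above, and the names map_event, shuffle and reduce_bucket are made up for illustration:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: each event belongs to the buckets from 3 minutes before to
# 3 minutes after its own minute (inferred from the 10:04..10:10 example).
HALF_WINDOW_MINUTES = 3


def map_event(event_id, time_str, half_window=HALF_WINDOW_MINUTES):
    """Emit one (bucket_minute, event_id) pair per bucket the event falls into."""
    t = datetime.strptime(time_str, "%H:%M")
    for offset in range(-half_window, half_window + 1):
        bucket = t + timedelta(minutes=offset)
        # Only the time of day is kept, so buckets near midnight wrap around.
        yield bucket.strftime("%H:%M"), event_id


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_bucket(bucket, event_ids):
    """All events emitted for the same minute meet here."""
    return bucket, sorted(event_ids)


events = [
    ("event1", "10:07"),
    ("event2", "10:10"),
    ("event3", "10:12"),
    ("event4", "10:20"),
    ("event5", "10:23"),
    ("event6", "10:25"),
]

pairs = [pair for event in events for pair in map_event(*event)]
result = dict(reduce_bucket(k, v) for k, v in shuffle(pairs).items())

for bucket in sorted(result):
    print(bucket, result[bucket])
```

In a real job the shuffle is done by the framework, so only the map and reduce logic would need to be written. Note that the map output grows by a factor of the window size (7 pairs per event here).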