Re: aggregation by time window
Oleg Ruchovets 2013-01-28, 13:51
Hi Zhiwei,
No :-). Fixed 7-minute buckets are easy: convert the time to milliseconds
and integer-divide by (7*60000); the result is the bucket key.
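As a minimal sketch of that bucket-key idea (Python is used only for illustration; `bucket_key` and `to_ms` are hypothetical helper names, and the buckets are assumed to be aligned to midnight):

```python
BUCKET_MS = 7 * 60_000  # 7 minutes expressed in milliseconds


def to_ms(hhmm: str) -> int:
    """Convert an 'HH:MM' string to milliseconds since midnight (assumed format)."""
    hours, minutes = map(int, hhmm.split(":"))
    return (hours * 60 + minutes) * 60_000


def bucket_key(timestamp_ms: int) -> int:
    """Integer-divide the timestamp by the bucket width to get the bucket key."""
    return timestamp_ms // BUCKET_MS


# Events from the sample data; events sharing a key share a fixed bucket.
keys = {t: bucket_key(to_ms(t)) for t in ("10:07", "10:10", "10:12")}
```

With midnight-aligned buckets, 10:07 gets a different key from 10:10 and 10:12, even though all three lie within 7 minutes of each other. That is why fixed buckets alone do not answer the question.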
I need to do the following:
Find the events that occurred within time T of a given event X.
In a very naive approach, I would take the first event and find the other
events that happened within 7 minutes of its time. But I think that will
be very slow, and I am looking for a way to improve on this naive approach.
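One possible improvement, sketched here under assumptions rather than as a settled answer: sort the events by time and make a single pass, opening a new window whenever an event falls T or more after the first event of the current window. The function name, the minutes-since-midnight encoding, and the half-open window [start, start + T) are all assumptions made for illustration. In a map/reduce job the sort would presumably have to come from the framework's key ordering.

```python
from typing import Iterable, List, Optional, Tuple

Event = Tuple[str, int]  # (event_id, time in minutes since midnight), assumed encoding


def group_by_anchor_window(events: Iterable[Event], window: int) -> List[List[str]]:
    """Group events into windows anchored at the first event not yet covered."""
    groups: List[List[str]] = []
    window_start: Optional[int] = None
    # Sorting once replaces the per-event search of the naive approach.
    for event_id, t in sorted(events, key=lambda e: e[1]):
        # Open a new window if there is none yet or this event is outside it.
        if window_start is None or t - window_start >= window:
            groups.append([])
            window_start = t
        groups[-1].append(event_id)
    return groups


# Sample data from the original question, as minutes since midnight.
sample = [
    ("event1", 607), ("event2", 610), ("event3", 612),
    ("event4", 620), ("event5", 623), ("event6", 625),
]
result = group_by_anchor_window(sample, window=7)
```

On the sample data this yields two groups, event1 to event3 and event4 to event6, matching the grouping given in the original question. The cost is dominated by the sort, instead of one scan per event.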
On Mon, Jan 28, 2013 at 3:09 PM, Zhiwei Lin <[EMAIL PROTECTED]> wrote:
> Do you mean every 7 mins?
> e.g., [10:07, 10:14),
> [10:14, 10:21), ...
> On 28 January 2013 12:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I have the following row data structure:
> > event_id | time
> > =============
> > event1 | 10:07
> > event2 | 10:10
> > event3 | 10:12
> > event4 | 10:20
> > event5 | 10:23
> > event6 | 10:25
> > The number of records is 50-100 million.
> > Question:
> > I need to get the events that occurred within a time window T.
> > For example, if T = 7 minutes:
> > event1, event2, event3 were detected within 7 minutes.
> > event4, event5, event6 were detected within 7 minutes.
> > How can I implement such an aggregation using map/reduce?
> > Thanks
> > Oleg.
> Best wishes.