Hadoop >> mail # user >> aggregation by time window


Re: aggregation by time window
Hi Zhiwei,
    No :-). Every 7 minutes is easy: converting the time to milliseconds and
computing milliseconds / (7 * 60000) with integer division gives a bucket key.
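
A minimal sketch of that bucket-key computation, assuming each time is available as a count of milliseconds (here, milliseconds since midnight for readability). The names `bucket_key` and `to_ms` are hypothetical helpers introduced only for illustration.

```python
WINDOW_MS = 7 * 60_000  # 7 minutes expressed in milliseconds


def to_ms(hhmm: str) -> int:
    """Convert an 'HH:MM' string to milliseconds since midnight."""
    hours, minutes = hhmm.split(":")
    return (int(hours) * 60 + int(minutes)) * 60_000


def bucket_key(timestamp_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Return the index of the fixed-size window containing timestamp_ms."""
    # Integer division by the whole window length. Writing
    # timestamp_ms / 7 * 60000 instead would divide by 7 and then multiply.
    return timestamp_ms // window_ms


if __name__ == "__main__":
    for hhmm in ("10:07", "10:10", "10:12"):
        print(hhmm, bucket_key(to_ms(hhmm)))
```

These buckets are aligned to the zero point of the clock, not to the first event, so two events less than 7 minutes apart (such as 10:07 and 10:10) can still receive different keys.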

I need to do the following:
    Find the events that occurred within time T of a given event X.

With a very naive approach, I need to take the first event and find the other
events that happened within 7 minutes of the first event's time. But I think it
will be very slow, and I am looking for a way to improve on this naive approach.
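
One possible way to avoid comparing every event with every other event is to sort by time and binary-search the end of each window. This is only a single-machine sketch of that idea, not something proposed in this thread and not a map/reduce job; at 50-100 million records the sort itself would have to be distributed. The names `to_ms` and `events_within` are hypothetical, and the window is assumed to be half-open, [t, t + T).

```python
from bisect import bisect_left


def to_ms(hhmm: str) -> int:
    """Convert an 'HH:MM' string to milliseconds since midnight."""
    hours, minutes = hhmm.split(":")
    return (int(hours) * 60 + int(minutes)) * 60_000


def events_within(events, window_ms):
    """Map each event id to the ids of later events inside its window.

    events is an iterable of (event_id, timestamp_ms) pairs. For an event
    at time t, the window is the half-open interval [t, t + window_ms).
    """
    ordered = sorted(events, key=lambda e: e[1])
    times = [ts for _, ts in ordered]
    related = {}
    for i, (event_id, ts) in enumerate(ordered):
        # First index at or after i + 1 whose time is >= ts + window_ms.
        end = bisect_left(times, ts + window_ms, lo=i + 1)
        related[event_id] = [eid for eid, _ in ordered[i + 1:end]]
    return related


if __name__ == "__main__":
    sample = [
        ("event1", to_ms("10:07")), ("event2", to_ms("10:10")),
        ("event3", to_ms("10:12")), ("event4", to_ms("10:20")),
        ("event5", to_ms("10:23")), ("event6", to_ms("10:25")),
    ]
    for event_id, others in events_within(sample, 7 * 60_000).items():
        print(event_id, others)
```

After the sort, each lookup costs one binary search plus the size of its result, instead of a scan over all records.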

Thanks
Oleg.

On Mon, Jan 28, 2013 at 3:09 PM, Zhiwei Lin <[EMAIL PROTECTED]> wrote:

> Do you mean every 7 mins?
> e.g., [10:07, 10:14),
>       [10:14, 10:21) ...
>
> On 28 January 2013 12:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >     I have the following row data structure:
> >
> > event_id   |  time
> > ==================
> > event1     |  10:07
> > event2     |  10:10
> > event3     |  10:12
> >
> > event4     |  10:20
> > event5     |  10:23
> > event6     |  10:25
> >
> > The number of records is 50-100 million.
> >
> > Question:
> >    I need to get the events that occurred within a time window T.
> >
> > For example, if T = 7 minutes:
> >      event1, event2, event3 were detected within 7 minutes.
> >      event4, event5, event6 were detected within 7 minutes.
> >
> > How can I implement such an aggregation using map/reduce?
> >
> > Thanks
> > Oleg.
> >
>
>
>
> --
>
> Best wishes.
>
> Zhiwei
>