Hadoop >> mail # user >> aggregation by time window


Oleg Ruchovets 2013-01-28, 12:56
Kai Voigt 2013-01-28, 13:17
Oleg Ruchovets 2013-01-28, 13:43
Kai Voigt 2013-01-28, 13:48
Oleg Ruchovets 2013-01-28, 14:49
Zhiwei Lin 2013-01-28, 13:09
Re: aggregation by time window
Hi Zhiwei,
    No :-). Every 7 minutes is easy: just transforming the time to
milliseconds / (7 * 60000), using integer division, will give you a bucket key.

I need to do the following:
    Find the events which occurred within time T of the event X.

In a very naive approach, I need to take the first event and find the other
events which happened within 7 minutes of the first event's time. But I think
it will be very slow, and I am looking for a way to improve this naive approach.
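
As an illustration of the bucket key described above, here is a minimal Python sketch. The names `bucket_key`, `to_ms`, and `WINDOW_MS` are hypothetical, and the windows are assumed to be aligned to midnight. The sample data is taken from the original question.

```python
WINDOW_MS = 7 * 60_000  # 7 minutes in milliseconds


def to_ms(hhmm: str) -> int:
    """Convert an 'HH:MM' string to milliseconds since midnight."""
    hours, minutes = hhmm.split(":")
    return (int(hours) * 60 + int(minutes)) * 60_000


def bucket_key(timestamp_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Index of the fixed window that contains timestamp_ms."""
    return timestamp_ms // window_ms  # integer division


events = [("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
          ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25")]

# Group event ids by their fixed-window bucket key.
buckets = {}
for event_id, hhmm in events:
    buckets.setdefault(bucket_key(to_ms(hhmm)), []).append(event_id)

print(buckets)
```

With these midnight-aligned windows, event1 lands in a different bucket from event2 and event3 even though all three lie within 7 minutes of each other, which is why fixed bucketing alone does not answer the question.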
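
One possible way to avoid comparing every event with every other event (a sketch of a common technique, not something proposed in this thread) is to reuse the bucket key: any event within T after an event X falls either in X's bucket or in the next one. Each event can therefore be sent to its own bucket and to the previous bucket, and each reducer only compares events it received. The plain Python below simulates the map, shuffle, and reduce steps in memory. All names (`map_event`, `reduce_bucket`, `run`) are hypothetical, and the window is taken as the closed interval [time of X, time of X + T].

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

WINDOW_MS = 7 * 60_000  # T = 7 minutes, in milliseconds


def map_event(event_id, t_ms, window_ms=WINDOW_MS):
    """Emit (bucket, (role, t_ms, event_id)) pairs for one event."""
    b = t_ms // window_ms
    yield b, ("anchor", t_ms, event_id)
    yield b, ("candidate", t_ms, event_id)
    # An anchor in the previous bucket may also be within T of this event.
    yield b - 1, ("candidate", t_ms, event_id)


def reduce_bucket(records, window_ms=WINDOW_MS):
    """For each anchor X, list the events in [time of X, time of X + T]."""
    anchors = sorted((t, e) for role, t, e in records if role == "anchor")
    candidates = sorted((t, e) for role, t, e in records if role == "candidate")
    times = [t for t, _ in candidates]
    for t, event_id in anchors:
        lo = bisect_left(times, t)
        hi = bisect_right(times, t + window_ms)
        yield event_id, [e for _, e in candidates[lo:hi] if e != event_id]


def run(events, window_ms=WINDOW_MS):
    """Simulate the shuffle phase in memory (stand-in for the framework)."""
    groups = defaultdict(list)
    for event_id, t_ms in events:
        for key, value in map_event(event_id, t_ms, window_ms):
            groups[key].append(value)
    result = {}
    for key in sorted(groups):
        for event_id, related in reduce_bucket(groups[key], window_ms):
            result[event_id] = related
    return result


def minutes(hh, mm):
    """Milliseconds since midnight for a clock time."""
    return (hh * 60 + mm) * 60_000


events = [("event1", minutes(10, 7)), ("event2", minutes(10, 10)),
          ("event3", minutes(10, 12)), ("event4", minutes(10, 20)),
          ("event5", minutes(10, 23)), ("event6", minutes(10, 25))]

for event_id, related in run(events).items():
    print(event_id, related)
```

For the sample data, event1 is paired with event2 and event3, and event4 with event5 and event6. Each event is emitted three times, so the shuffle volume grows by a constant factor, while each reducer sorts and searches only the events of two adjacent windows.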

Thanks
Oleg.

On Mon, Jan 28, 2013 at 3:09 PM, Zhiwei Lin <[EMAIL PROTECTED]> wrote:

> do you mean every 7 mins?
> e.g, [10:07, 10:14),
>        [10:14, 10:21) .....
>
> On 28 January 2013 12:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
>
> > Hi ,
> >     I have such row data structure:
> >
> > event_id  |  time
> > ==================
> > event1    |  10:07
> > event2    |  10:10
> > event3    |  10:12
> >
> > event4    |  10:20
> > event5    |  10:23
> > event6    |  10:25
> >
> > The number of records is 50-100 million.
> >
> > Question:
> >    I need to get the events that occurred within time T.
> >
> > For example, if T = 7 minutes:
> >      event1, event2, event3 were detected within 7 minutes.
> >      event4, event5, event6 were detected within 7 minutes.
> >
> > How can I implement such an aggregation using map/reduce?
> >
> > Thanks
> > Oleg.
> >
>
>
>
> --
>
> Best wishes.
>
> Zhiwei
>