Hadoop >> mail # user >> aggregation by time window


Re: aggregation by time window
Hi Kai,
    This is very interesting. Can you please explain your idea in more
detail?
What would the key be in the map phase?

Suppose we have an event at 10:07. How would you emit it to multiple
buckets?

Thanks
Oleg.
On Mon, Jan 28, 2013 at 3:17 PM, Kai Voigt <[EMAIL PROTECTED]> wrote:

> Quick idea:
>
> Since each of your events will go into several buckets, you could use
> map() to emit each item multiple times, once for each bucket.
>
> On 28.01.2013 at 13:56, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >    I have the following row data structure:
> >
> > event_id   |  time
> > ===================
> > event1     |  10:07
> > event2     |  10:10
> > event3     |  10:12
> >
> > event4     |  10:20
> > event5     |  10:23
> > event6     |  10:25
>
> map(event1,10:07) would emit (10:04,event1), (10:05,event1), ...,
> (10:10,event1) and so on.
>
> In reduce(), all your desired events would meet for the same minute.
>
> Kai
>
> --
> Kai Voigt
> [EMAIL PROTECTED]
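
The multi-bucket emit that Kai describes can be sketched in a few lines of
plain Python. This is only an illustration under stated assumptions, not
Kai's actual code: the window size is not given in the thread, so RADIUS = 3
minutes is a guess chosen to match the 10:04 to 10:10 range in his example,
and map_event, shuffle and reduce_bucket are hypothetical names that stand in
for a real Hadoop mapper, shuffle phase and reducer.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a bucket for minute M collects events within +/- RADIUS minutes
# of M. The thread does not state the window size; 3 is a hypothetical value.
RADIUS = 3


def map_event(event_id, time_str, radius=RADIUS):
    """Emit (bucket_minute, event_id) once for every bucket the event falls in."""
    t = datetime.strptime(time_str, "%H:%M")
    for offset in range(-radius, radius + 1):
        bucket = t + timedelta(minutes=offset)
        yield bucket.strftime("%H:%M"), event_id


def shuffle(pairs):
    """Stand-in for the MapReduce shuffle: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_bucket(bucket, event_ids):
    """All events near this minute arrive together; here they are just sorted."""
    return bucket, sorted(event_ids)


# Sample rows taken from the table in the original question.
events = [("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
          ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25")]

pairs = [kv for ev in events for kv in map_event(*ev)]
result = dict(reduce_bucket(b, ids) for b, ids in shuffle(pairs).items())
```

With these assumptions the key in the map phase is the bucket minute, and
result["10:10"] holds event1, event2 and event3, because 10:10 lies within
three minutes of each of them. The cost of this approach is that every event
is emitted 2 * RADIUS + 1 times, so the shuffle volume grows with the window.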