HBase, mail # user - how to model data based on "time bucket"


Re: how to model data based on "time bucket"
Michel Segel 2013-01-28, 15:54
Tough one: if your events are keyed on time alone, you will hit a hot spot on writes. Reads, not so much...
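
To illustrate the write hot spot: HBase keeps rows sorted by key, so keys that start with an ever-increasing timestamp all land in the same (last) region. One common mitigation is to prefix the key with a small salt derived from the event id. The following is only a sketch of that idea; the bucket count `NUM_SALTS`, the key layout, and both function names are assumptions made for illustration and do not come from this thread.

```python
import struct
import zlib

NUM_SALTS = 8  # hypothetical number of salt buckets, not a recommendation


def salted_row_key(event_id: str, epoch_seconds: int) -> bytes:
    """Build salt (1 byte) + big-endian timestamp (8 bytes) + event id."""
    # crc32 is deterministic, so the same event always gets the same salt.
    salt = zlib.crc32(event_id.encode("utf-8")) % NUM_SALTS
    return bytes([salt]) + struct.pack(">Q", epoch_seconds) + event_id.encode("utf-8")


def scan_ranges(start_seconds: int, stop_seconds: int) -> list:
    """Return one (start_key, stop_key) pair per salt; stop is exclusive."""
    return [
        (bytes([s]) + struct.pack(">Q", start_seconds),
         bytes([s]) + struct.pack(">Q", stop_seconds))
        for s in range(NUM_SALTS)
    ]
```

The trade-off is on the read side: a time-range query now needs one scan per salt value, with the results merged by the client.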

TSDB would be a good start ...

You may not need 'buckets', just a timestamp in the row key, and then set start and stop key values for the scan.
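
The start/stop key idea can be sketched as follows. This is a plain-Python stand-in for a range scan, not HBase client code: a sorted list plays the role of the table, and the key layout (zero-padded minutes since midnight, then the event id) is an assumption made for illustration. The sample rows are the ones from the question quoted below, and the window is treated as start-inclusive and stop-exclusive, which is also an assumption.

```python
from bisect import bisect_left


def row_key(hhmm: str, event_id: str) -> str:
    """Hypothetical key: zero-padded minutes since midnight, then event id."""
    hours, minutes = hhmm.split(":")
    return f"{int(hours) * 60 + int(minutes):04d}|{event_id}"


# Sorted keys stand in for a table that keeps its rows in key order.
ROWS = sorted(row_key(t, e) for e, t in [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
])


def events_in_window(start_hhmm: str, window_minutes: int) -> list:
    """Return event ids whose time t satisfies start <= t < start + window."""
    hours, minutes = start_hhmm.split(":")
    start = int(hours) * 60 + int(minutes)
    start_key = f"{start:04d}|"                  # inclusive start row
    stop_key = f"{start + window_minutes:04d}|"  # exclusive stop row
    lo = bisect_left(ROWS, start_key)
    hi = bisect_left(ROWS, stop_key)
    return [key.split("|", 1)[1] for key in ROWS[lo:hi]]
```

With these sample rows, `events_in_window("10:07", 7)` returns `['event1', 'event2', 'event3']` and `events_in_window("10:10", 7)` returns `['event2', 'event3']`, matching the groups described in the question.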

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 28, 2013, at 7:06 AM, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have the following row data structure:
>
> event_id | time
> =================
> event1 | 10:07
> event2 | 10:10
> event3 | 10:12
>
> event4 | 10:20
> event5 | 10:23
> event6 | 10:25
>
>
> The number of records is 50-100 million.
>
>
> Question:
>
> I need to find the group of events that starts from eventX and falls
> within a time window (bucket) of T.
>
>
> For example, if T = 7 minutes:
> Starting from event1, {event1, event2, event3} were detected within
> 7 minutes.
>
> Starting from event2, {event2, event3} were detected within 7
> minutes.
>
> Starting from event4, {event4, event5, event6} were detected within
> 7 minutes.
>
> Is there a way to model the data in HBase to get this?
>
> Thanks