HBase >> mail # user >> how to model data based on "time bucket"


Re: how to model data based on "time bucket"
Hi Rodrigo,
  Can you please explain your solution in more detail? You said that I will
have another table. How many tables will I have? Will I have 2 tables? What
will be the schema of the tables?

Let me explain what I am trying to achieve:
    I have ~50 million records like {time|event}. I want to put the data in
HBase in such a way that I can get:
    the event at time X and all events that occurred after event X within
T minutes (for example, within 7 minutes).
So I will be able to scan the whole table and get groups like:

  {event1:10:02} corresponds to events {event2:10:03} , {event3:10:05} ,
{event4:10:06}
  {event2:10:30} corresponds to events {event5:10:32} , {event3:10:33} ,
{event3:10:36}.

Thanks
Oleg.
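
The grouping described above can be sketched in memory, independent of HBase. This is only an illustration under stated assumptions: times are zero-padded "HH:MM" strings on a single day, the window is inclusive of t + T minutes, and the function name followers_within is hypothetical. The sample data is taken from the original question quoted below.

```python
from datetime import datetime, timedelta


def followers_within(events, minutes):
    """For each (event_id, "HH:MM") pair, list the ids of later events
    that occur within `minutes` of it. Assumes all times fall on one day."""
    # Sort by time so that each window is a contiguous slice.
    ordered = sorted(events, key=lambda e: datetime.strptime(e[1], "%H:%M"))
    times = [datetime.strptime(t, "%H:%M") for _, t in ordered]
    window = timedelta(minutes=minutes)

    groups = []
    end = 0  # exclusive end of the current window; it never moves backwards
    for start, (event_id, _) in enumerate(ordered):
        end = max(end, start + 1)
        while end < len(ordered) and times[end] - times[start] <= window:
            end += 1
        groups.append((event_id, [eid for eid, _ in ordered[start + 1:end]]))
    return groups


# Sample data from the original question, with T = 7 minutes.
sample = [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
]
for event_id, followers in followers_within(sample, 7):
    print(event_id, followers)
```

Because the input is sorted once and the window end only advances, the pass over the data is linear after sorting. Whether the boundary at exactly T minutes should be included is an assumption here, not something the thread specifies.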
On Mon, Jan 28, 2013 at 5:17 PM, Rodrigo Ribeiro <
[EMAIL PROTECTED]> wrote:

> You can use another table as an index, using a rowkey like
> '{time}:{event_id}', and then scan in the range ["10:07", "10:15").
>
> On Mon, Jan 28, 2013 at 10:06 AM, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
>
> > Hi ,
> >
> > I have the following row data structure:
> >
> > event_id | time
> > ============
> > event1 | 10:07
> > event2 | 10:10
> > event3 | 10:12
> >
> > event4 | 10:20
> > event5 | 10:23
> > event6 | 10:25
> >
> >
> > The number of records is 50-100 million.
> >
> >
> > Question:
> >
> > I need to find the group of events that starts from eventX and falls
> > within the time window bucket = T.
> >
> >
> > For example, if T = 7 minutes:
> > Starting from event1, {event1, event2, event3} were detected within
> > 7 minutes.
> >
> > Starting from event2, {event2, event3} were detected within 7
> > minutes.
> >
> > Starting from event4, {event4, event5, event6} were detected within
> > 7 minutes.
> > Is there a way to model the data in HBase to get these groups?
> >
> > Thanks
> >
>
>
>
> --
>
> *Rodrigo Pereira Ribeiro*
> Software Developer
> www.jusbrasil.com.br
>
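
The index-table idea quoted above relies on HBase keeping rowkeys in sorted order, so that a scan with an inclusive start row and an exclusive stop row returns one time window. The sketch below only simulates that behaviour with a sorted Python list; it does not call HBase. The helper names make_rowkey and scan are hypothetical, and it assumes zero-padded "HH:MM" times so that lexicographic order matches chronological order (real keys would likely also need the date).

```python
import bisect


def make_rowkey(time_hhmm, event_id):
    # Rowkey layout suggested in the thread: '{time}:{event_id}'.
    return f"{time_hhmm}:{event_id}"


def scan(sorted_keys, start_row, stop_row):
    """Return keys in [start_row, stop_row): start inclusive, stop exclusive,
    mimicking a range scan over lexicographically sorted rowkeys."""
    lo = bisect.bisect_left(sorted_keys, start_row)
    hi = bisect.bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


# Index "table" built from the sample rows in the original question.
rows = [
    ("event1", "10:07"), ("event2", "10:10"), ("event3", "10:12"),
    ("event4", "10:20"), ("event5", "10:23"), ("event6", "10:25"),
]
index = sorted(make_rowkey(t, e) for e, t in rows)

# Window starting at event1 (10:07) with T = 7 minutes: scan ["10:07", "10:15").
print(scan(index, "10:07", "10:15"))
```

With this layout, each window costs one short range scan, but producing the group for every one of the 50-100 million events still means one scan per starting event.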