HBase >> mail # user >> Temporal in Hbase?


Shumin Wu 2012-09-17, 16:28
Anoop Sam John 2012-09-18, 03:16
Shumin Wu 2012-10-10, 23:24
Ramkrishna.S.Vasudevan 2012-10-11, 05:32
Anoop Sam John 2012-10-11, 05:10
Re: Temporal in Hbase?
Anoop and Ramkrishna,

Your combined answers solved my problem! I tried the approach this morning.
Without writing my own custom filter, only 20+ LOC completed my mission!
Thanks for your help!

Anoop: "FYI, a FilterList can contain another FilterList.
So if you have a query like col1=? AND (col2=? OR col2=?), you can use
nested FilterLists: one inner FilterList with MUST_PASS_ONE for col2, and an
outer FilterList with MUST_PASS_ALL that contains the inner FilterList and a
SingleColumnValueFilter (SCVF) for col1.
Hope I understood your problem and am giving the answer you are looking
for :)"

Ramkrishna: "On the SingleColumnValueFilter we have a property set through
setFilterIfMissing().
If the specified column is not found, setting this property to true will
filter out the row. If you still want the row, the property should be false.
Default is false."

Shumin
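
The combination described above can be sketched in plain Java. This is only a model of the semantics, not HBase code and not the poster's actual 20+ lines: the class TemporalOverlap, the nested ColumnCheck, and the row() helper are hypothetical names, and a missing end_time column is assumed to be how "end_time is null" is stored. ColumnCheck stands in for a SingleColumnValueFilter with its filterIfMissing flag, and the && in overlaps() stands in for a MUST_PASS_ALL FilterList.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongPredicate;

/** Stdlib-only model of the filter logic discussed in this thread; not HBase code. */
final class TemporalOverlap {

    /** Models one single-column check: a column name, a test, and a filterIfMissing flag. */
    static final class ColumnCheck {
        final String column;
        final LongPredicate test;
        final boolean filterIfMissing;

        ColumnCheck(String column, LongPredicate test, boolean filterIfMissing) {
            this.column = column;
            this.test = test;
            this.filterIfMissing = filterIfMissing;
        }

        /** Returns true when the row passes this check. */
        boolean passes(Map<String, Long> row) {
            Long value = row.get(column);
            if (value == null) {
                // Column absent: the row passes unless filterIfMissing is set.
                return !filterIfMissing;
            }
            return test.test(value);
        }
    }

    /** Hypothetical helper: a null endTime means the end_time column is absent (open-ended). */
    static Map<String, Long> row(long startTime, Long endTime) {
        Map<String, Long> r = new HashMap<>();
        r.put("start_time", startTime);
        if (endTime != null) {
            r.put("end_time", endTime);
        }
        return r;
    }

    /** Both checks must pass: start_time < constEt, and end_time >= constSt or end_time missing. */
    static boolean overlaps(Map<String, Long> row, long constSt, long constEt) {
        ColumnCheck startCheck = new ColumnCheck("start_time", v -> v < constEt, true);
        ColumnCheck endCheck = new ColumnCheck("end_time", v -> v >= constSt, false);
        return startCheck.passes(row) && endCheck.passes(row);
    }

    public static void main(String[] args) {
        System.out.println(overlaps(row(1800, 1801L), 1850, 1900)); // prints false
        System.out.println(overlaps(row(1800, null), 1850, 1900));  // prints true
    }
}
```

The open-ended row passes only because the end_time check leaves filterIfMissing at false; setting it to true would drop every row that is still valid.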

On Wed, Oct 10, 2012 at 10:32 PM, Ramkrishna.S.Vasudevan <
[EMAIL PROTECTED]> wrote:

> If your column does not contain the given value, meaning the end_time
> qualifier is null, the row should still be retrieved, right?
>
> As far as I understand what a temporal database is (I am not very familiar
> with it; I just read through the wiki to learn what temporal means), it is
> related to multi-versioning of the same row.
> So the same row will have multiple versions.
>
> Suppose
> row_key   col_A   col_B   start_time   end_time
> row1      xxx     yyy     1800         1801
> row1      xxx     yyy     1800         (empty)
>
> With versioning, if the row1 version with an empty end_time is inserted,
> it will show up first. If your max versions is 1, the read will retrieve
> only the latest value.
> On the SingleColumnValueFilter we have a property set through
> setFilterIfMissing().
> If the specified column is not found, setting this property to true will
> filter out the row. If you still want the row, the property should be false.
> Default is false.
>
> So if my query is start_time > 1800 and end_time < 1900 with MUST_PASS_ALL,
> then with setFilterIfMissing(false) we can get the latest row, which has an
> empty end_time.
> Does this answer your question?
>
> Regards
> Ram
>
> > -----Original Message-----
> > From: Shumin Wu [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, October 11, 2012 4:54 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Temporal in Hbase?
> >
> > How could I miss this reply!!
> >
> > Hi Anoop,
> >
> > First, thanks for your reply to my question, and apologies for not
> > following up promptly. I have put out a million fires and come back to
> > this issue.
> > Here are my thoughts. Yes, a FilterList with MUST_PASS_ALL works fine for
> > a simple temporal clause.
> >
> > However, I have a use case like this. I need to find all data whose time
> > range overlaps a given time range. Some data are still valid now; they
> > have an open-ended end timestamp, marked as a null end_time in our
> > database.
> >
> > To express it formally, for a given time range [const_st, const_et],
> > where const_st represents the constant start time and const_et the
> > constant end time, my task is to find all data rows with start_time and
> > end_time satisfying this expression:
> >
> > start_time < const_et and (end_time >= const_st or end_time is null).
> >
> >
> > In a FilterList, I can choose either MUST_PASS_ALL or MUST_PASS_ONE, but
> > neither is applicable to this use case.
> >
> > It would be nice if there were a temporal filter that allowed me to
> > select data valid between [const_st, const_et] (with a null end_time
> > automatically interpreted as valid up to now).
> >
> > My domain is not traditionally an Internet area, but I am sure folks in
> > the clickstream business have a similar need, and I am wondering how they
> > solve this problem.
> >
> > Temporal support is common in traditional databases, so maybe HBase can
> > offer the same? I guess the current version does not have this support,
> > and I would need to write a custom filter myself. I could be wrong.
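
The expression in the original question mixes AND and OR, which is the case the nested FilterList suggestion in the reply covers. A rough stdlib-only model of that nesting is sketched below. It is not HBase code: NestedFilterModel, filterList(), row(), and overlapFilter() are hypothetical names, the Operator enum only mirrors the MUST_PASS_ALL and MUST_PASS_ONE names used in the thread, and a null map value is assumed to represent "end_time is null".

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Stdlib-only model of nested filter lists; names mirror the thread, not the HBase API. */
final class NestedFilterModel {

    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    /** Combines filters as the thread describes: all must pass, or at least one must pass. */
    static <T> Predicate<T> filterList(Operator op, List<Predicate<T>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    /** Hypothetical helper: a null endTime models "end_time is null" (still valid). */
    static Map<String, Long> row(long startTime, Long endTime) {
        Map<String, Long> r = new HashMap<>();
        r.put("start_time", startTime);
        r.put("end_time", endTime); // HashMap permits null values
        return r;
    }

    /** start_time < constEt AND (end_time >= constSt OR end_time is null) */
    static Predicate<Map<String, Long>> overlapFilter(long constSt, long constEt) {
        Predicate<Map<String, Long>> startBeforeRangeEnd =
                r -> r.get("start_time") < constEt;
        Predicate<Map<String, Long>> endIsNull =
                r -> r.get("end_time") == null;
        Predicate<Map<String, Long>> endAtOrAfterRangeStart =
                r -> r.get("end_time") != null && r.get("end_time") >= constSt;

        // Inner list: either condition on end_time is enough.
        Predicate<Map<String, Long>> inner =
                filterList(Operator.MUST_PASS_ONE, Arrays.asList(endIsNull, endAtOrAfterRangeStart));
        // Outer list: the start_time condition and the inner list must both pass.
        return filterList(Operator.MUST_PASS_ALL, Arrays.asList(startBeforeRangeEnd, inner));
    }
}
```

For example, overlapFilter(1850, 1900).test(row(1800, null)) evaluates to true for an open-ended row, while a row that ended in 1801 is rejected by the inner list.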