We decided to use setTimeRange and setMaxVersions, and to drop the column
holding the reference timestamp (i.e. we no longer store that column in
HBase). This gives the behavior we want, but it seems very inefficient
because all versions are processed before setMaxVersions takes effect
(I just posted some new findings in another post).
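
For readers following along, here is a rough plain-Java model of the
semantics we rely on. It is not HBase client code. It assumes that
setTimeRange selects timestamps in [minStamp, maxStamp) and that
setMaxVersions then caps how many of the newest in-range versions come back
per column. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of the versions of a single (row, column) cell.
final class VersionedColumnModel {
    // timestamp -> value
    private final NavigableMap<Long, String> versions = new TreeMap<>();

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Returns up to maxVersions values whose timestamp lies in
    // [minStamp, maxStamp), newest first.
    List<String> read(long minStamp, long maxStamp, int maxVersions) {
        List<String> result = new ArrayList<>();
        if (maxVersions <= 0 || minStamp >= maxStamp) {
            return result;
        }
        for (String value : versions.subMap(minStamp, true, maxStamp, false)
                                    .descendingMap().values()) {
            if (result.size() == maxVersions) {
                break;
            }
            result.add(value);
        }
        return result;
    }
}
```

Under these assumptions, read(0, t + 1, 1) yields the value that was current
as of timestamp t, which is the lookup we want for each column.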
On Mon, Aug 20, 2012 at 4:47 PM, Alex Baranau <[EMAIL PROTECTED]> wrote:
> So, you have a row with key rowKeyA and column col1. It contains two
> values, value1 and value2, at timestamp1 and timestamp2 respectively, where
> timestamp1 is the most recent. And you want to fetch the "most recent but
> one" values in all columns when doing the scan. I.e. you don't know
> timestamp1 or timestamp2 exactly; you just need to fetch the value which
> was placed before the most recent one. Is that correct?
> I don't think there's a filter that would allow you to do so
> "out-of-the-box". You should probably be able to write such a filter and
> use scan.setMaxVersions(2). I'm not sure whether KeyValues are fed into
> the filter ordered by their timestamp.
> How about returning the 2 most recent values to the client and filtering on
> the client side? Why doesn't this work in your case? (Are the values in
> the columns large?)
> Alex Baranau
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> On Mon, Aug 20, 2012 at 2:57 PM, Jerry Lam <[EMAIL PROTECTED]> wrote:
> > Hi HBase community:
> > I have a requirement in which I need to query a row based on the
> > timestamp stored in the value of a column of that row. For example:
> > (rowkeyA of col1) -> (value) at timestamp = t1, (value) stores t2. Result
> > should return all columns of rowkeyA at timestamp = t2.
> > Note that t1 > t2 ALWAYS.
> > Does this sound like something that can be done using a Filter? If yes,
> > can it be done using the existing filters in HBase without customization?
> > Best Regards,
> > Jerry
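
To make the original requirement concrete, below is a small plain-Java sketch
of it as a two-step client-side lookup. It is not HBase client code. It
assumes the reference column's newest value is a decimal timestamp t2 and
that "at timestamp = t2" means an exact match. The row is modelled as a map
from column name to (timestamp -> value), and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical model: resolve t2 from one column, then read every column at t2.
final class ReferenceTimestampLookup {
    static Map<String, String> rowAtReferencedTimestamp(
            Map<String, NavigableMap<Long, String>> row, String referenceColumn) {
        Map<String, String> result = new LinkedHashMap<>();
        NavigableMap<Long, String> reference = row.get(referenceColumn);
        if (reference == null || reference.isEmpty()) {
            return result;
        }
        // Step 1: the newest version (written at t1) stores t2 as its value.
        long t2 = Long.parseLong(reference.lastEntry().getValue());
        // Step 2: keep, for each column, the version written at exactly t2.
        for (Map.Entry<String, NavigableMap<Long, String>> column : row.entrySet()) {
            String value = column.getValue().get(t2);
            if (value != null) {
                result.put(column.getKey(), value);
            }
        }
        return result;
    }
}
```

Against a real table this would correspond to two reads (one to obtain t2,
one restricted to that timestamp), which is the round trip a server-side
filter would avoid.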