HBase, mail # user - Compare range of numbers on column family

Re: Compare range of numbers on column family
anil gupta 2012-04-20, 20:21
Hi Akbar,

In order to do a numerical comparison, you will first need to store the
data being compared as a Number rather than a String. To store
numerical data, you will need to write a custom mapper if you are using
HBase bulk loading.
Once you have stored the data as numbers rather than Strings, you will need to
use the BinaryComparator.
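
To illustrate why the storage format matters, here is a minimal sketch in plain Java with no HBase dependency. It assumes the comparator orders values by unsigned, byte-by-byte comparison and that numbers are written as 8-byte big-endian longs, which is the layout Bytes.toBytes(long) produces. The class and method names (NumericEncodingSketch, encodeLong, compareBytes) are hypothetical and only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class NumericEncodingSketch {

    // Big-endian 8-byte encoding of a long (assumed to match the layout
    // written by HBase's Bytes.toBytes(long)).
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Unsigned, byte-by-byte lexicographic comparison, standing in for
    // the ordering a binary comparator applies to stored values.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] s6000 = "6000".getBytes(StandardCharsets.UTF_8);
        byte[] s10000 = "10000".getBytes(StandardCharsets.UTF_8);

        // Stored as strings: "6000" sorts after "10000" because '6' > '1'.
        System.out.println(compareBytes(s6000, s10000) > 0);

        // Stored as longs: 6000 sorts before 10000, matching numeric order.
        System.out.println(compareBytes(encodeLong(6000L), encodeLong(10000L)) < 0);
    }
}
```

Note that this ordering only matches numeric order for non-negative values. A negative long has its high bit set, so under an unsigned byte comparison it sorts after every positive value.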
Hope this helps.


On Fri, Apr 20, 2012 at 3:57 AM, Bijieshan <[EMAIL PROTECTED]> wrote:

> Akbar,
> I think you need to customize a comparator yourself. You can't get the
> results you want by using BinaryComparator.
> Hope I understood you correctly.
> Jieshan.
> -----Original Message-----
> From: Akbar Gadhiya [mailto:[EMAIL PROTECTED]]
> Sent: Friday, April 20, 2012 6:19 PM
> Subject: Compare range of numbers on column family
> Hello,
> I need help scanning data by a column value within a column family.
> With this sample data and scan command, the first scan command returns nothing
> and the second returns the row containing 6000.
> PK.john.20120422 column=alternateKey:ms, timestamp=1334912415796,
> value=6000
> My use case is to scan records that fall between a start and an end timestamp.
> (The timestamp is stored in the column alternateKey:ms.)
> We cannot use the timestamp provided by HBase because it indicates the time when
> the record was inserted into HBase, but we require a timestamp related to business
> needs.
> We are trying to compare numbers as opposed to doing a lexical comparison. Is there
> any way I can perform this scan operation?
> My data and scan command look like this:
> create 'innar_demo', 'user', 'alternateKey', 'content'
> put 'innar_demo', 'PK.innar.20120418', 'user', 'Innar'
> put 'innar_demo', 'PK.innar.20120418', 'alternateKey:city', 'Tallinn'
> put 'innar_demo', 'PK.innar.20120418', 'alternateKey:phone', '0001'
> put 'innar_demo', 'PK.innar.20120418', 'alternateKey:ms', '1000'
> put 'innar_demo', 'PK.innar.20120418', 'content', 'Innar_GPB'
> put 'innar_demo', 'PK.akbar.20120418', 'user', 'Akbar'
> put 'innar_demo', 'PK.akbar.20120418', 'alternateKey:city', 'Ahmedabad'
> put 'innar_demo', 'PK.akbar.20120418', 'alternateKey:phone', '0002'
> put 'innar_demo', 'PK.akbar.20120418', 'alternateKey:ms', '2000'
> put 'innar_demo', 'PK.akbar.20120418', 'content', 'Akbar_GPB'
> put 'innar_demo', 'PK.ell.20120419', 'user', 'Ell'
> put 'innar_demo', 'PK.ell.20120419', 'alternateKey:city', 'Bangkok'
> put 'innar_demo', 'PK.ell.20120419', 'alternateKey:phone', '0003'
> put 'innar_demo', 'PK.ell.20120419', 'alternateKey:ms', '3000'
> put 'innar_demo', 'PK.ell.20120419', 'content', 'Ell_GPB'
> put 'innar_demo', 'PK.jane.20120420', 'user', 'Jane'
> put 'innar_demo', 'PK.jane.20120420', 'alternateKey:city', 'Jersey City'
> put 'innar_demo', 'PK.jane.20120420', 'alternateKey:phone', '0004'
> put 'innar_demo', 'PK.jane.20120420', 'alternateKey:ms', '4000'
> put 'innar_demo', 'PK.jane.20120420', 'content', 'Jane_GPB'
> put 'innar_demo', 'PK.michael.20120421', 'user', 'Michael'
> put 'innar_demo', 'PK.michael.20120421', 'alternateKey:city', 'Berlin'
> put 'innar_demo', 'PK.michael.20120421', 'alternateKey:phone', '0005'
> put 'innar_demo', 'PK.michael.20120421', 'alternateKey:ms', '5000'
> put 'innar_demo', 'PK.michael.20120421', 'content', 'Michael_GPB'
> put 'innar_demo', 'PK.john.20120422', 'user', 'John'
> put 'innar_demo', 'PK.john.20120422', 'alternateKey:city', 'London'
> put 'innar_demo', 'PK.john.20120422', 'alternateKey:phone', '0006'
> put 'innar_demo', 'PK.john.20120422', 'alternateKey:ms', '6000'
> put 'innar_demo', 'PK.john.20120422', 'content', 'John_GPB'
> import org.apache.hadoop.hbase.filter.FilterList
> import org.apache.hadoop.hbase.filter.FilterList::Operator
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.filter.BinaryComparator
> import org.apache.hadoop.hbase.util.Bytes
> import org.apache.hadoop.hbase.filter.ColumnRangeFilter

Thanks & Regards,
Anil Gupta