HBase >> mail # user >> Extract a whole table for a given time(stamp)


Re: Extract a whole table for a given time(stamp)
You can use the Export MapReduce job that ships with HBase; it lets you set a
time range: http://hbase.apache.org/book.html#export
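
To make the time-range idea concrete, here is a minimal sketch in Java of how the arguments for the Export job might be assembled for an "as of timestamp T" extract. It assumes the Export usage `<tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]`, that the end of the range is exclusive (hence T + 1), and that cells were written with the default millisecond timestamps. The class name `ExportArgs`, the table name, and the output directory are hypothetical and not part of HBase.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of HBase): builds the argument list for the
// bundled Export MapReduce job so it keeps one version per cell, chosen from
// the versions whose timestamp is at or before a given point in time.
final class ExportArgs {
    static final String EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export";

    // Assumed usage: <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    static List<String> asOf(String table, String outputDir, long asOfMillis) {
        if (asOfMillis < 0 || asOfMillis == Long.MAX_VALUE) {
            throw new IllegalArgumentException("timestamp out of range: " + asOfMillis);
        }
        long start = 0L;                      // include everything from the beginning
        long endExclusive = asOfMillis + 1;   // assumption: the end time is exclusive
        return Arrays.asList(
                EXPORT_CLASS,
                table,
                outputDir,
                "1",                          // one version: the newest inside the range
                Long.toString(start),
                Long.toString(endExclusive));
    }

    public static void main(String[] args) {
        // Hypothetical table, output directory, and epoch-millisecond timestamp.
        List<String> cmd = asOf("mytable", "/export/mytable", 1367850420000L);
        System.out.println("hbase " + String.join(" ", cmd));
    }
}
```

The printed line is meant to be run on a node with the HBase client installed. Asking for a single version inside the range [0, T + 1) is what should give the "newest value at or before T" view of each cell, provided the table still retains enough versions to cover T.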

J-D
On Mon, May 6, 2013 at 10:27 AM, Gaurav Pandit <[EMAIL PROTECTED]> wrote:

> Hi HBase users,
>
> We have a use case where we need to know how data looked at a given time in
> the past.
>
> The data is stored in HBase, of course, with multiple versions, and the
> goal is to extract all records (rowkey, columns) as of a given
> timestamp to a file.
>
>
> I am trying to figure out the best way to achieve this.
>
> The options I know are:
> 1. Write a *Java* client using the HBase Java API and scan the HBase table.
> 2. Do the same, but over the HBase *Thrift* API using Perl (since
> our environment is mostly Perl).
> 3. Use *Hive* to point to the HBase table, and use Sqoop to extract data from
> the Hive table onto the client / RDBMS.
> 4. Use *Pig* to extract data from the HBase table, dump it on HDFS, and move
> the file over to the client.
>
> So far, I have successfully implemented option (2). I am still running some
> tests to see how it performs, but it works fine as such.
>
> My questions are:
> 1. Are options (3) and (4) even possible? I am not sure whether we can
> access the table as of a given timestamp from Pig or Hive.
> 2. Is there a better way of achieving this?
>
>
> Thanks!
> Gaurav
>