Re: Extract a whole table for a given time(stamp)
You can use the Export MR job provided with HBase; it lets you set a time
range: http://hbase.apache.org/book.html#export
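To make the "as of a given timestamp" semantics concrete, here is a minimal plain-Java sketch. It does not call HBase at all: it models each cell as a sorted map from timestamp to value and keeps the newest version at or before the cutoff. The class name `AsOfSnapshot`, the `row/family:qualifier` key format, and the sample data are all made up for illustration. My understanding is that a scan or export limited to one version per cell, with an exclusive upper time bound just past the cutoff, selects the same cells; treat that as an assumption to check against the book link above. Delete markers are not modeled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of versioned cells; not the HBase client API.
public class AsOfSnapshot {

    // Returns the newest value whose timestamp is <= asOf, or null if the
    // cell had no version yet at that time.
    static String valueAsOf(NavigableMap<Long, String> versions, long asOf) {
        Map.Entry<Long, String> entry = versions.floorEntry(asOf);
        return entry == null ? null : entry.getValue();
    }

    // Builds the view of every cell as of the given timestamp, skipping
    // cells that did not exist yet.
    static Map<String, String> snapshot(
            Map<String, NavigableMap<Long, String>> cells, long asOf) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, NavigableMap<Long, String>> cell : cells.entrySet()) {
            String value = valueAsOf(cell.getValue(), asOf);
            if (value != null) {
                out.put(cell.getKey(), value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative data: keys are "row/family:qualifier" (made-up format).
        Map<String, NavigableMap<Long, String>> cells = new LinkedHashMap<>();

        NavigableMap<Long, String> first = new TreeMap<>();
        first.put(100L, "old");
        first.put(200L, "new");
        cells.put("row1/cf:a", first);

        NavigableMap<Long, String> second = new TreeMap<>();
        second.put(300L, "late");
        cells.put("row2/cf:a", second);

        // As of t=150, only row1's first version exists.
        System.out.println(snapshot(cells, 150L)); // prints {row1/cf:a=old}
    }
}
```

The point of the sketch is only that "as of T" means "newest version no later than T, per cell", not "versions written exactly at T".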

J-D
On Mon, May 6, 2013 at 10:27 AM, Gaurav Pandit <[EMAIL PROTECTED]> wrote:

> Hi HBase users,
>
> We have a use case where we need to know how data looked at a given time in
> the past.
>
> The data is stored in HBase, of course, with multiple versions, and the
> goal is to be able to extract all records (rowkey, columns) as of a given
> timestamp to a file.
>
>
> I am trying to figure out the best way to achieve this.
>
> The options I know are:
> 1. Write a *Java* client using the HBase Java API, and scan the HBase table.
> 2. Do the same, but over the *Thrift* HBase API using Perl (since
> our environment is mostly Perl).
> 3. Use *Hive* to point to the HBase table, and use Sqoop to extract data from
> the Hive table onto the client / RDBMS.
> 4. Use *Pig* to extract data from the HBase table, dump it on HDFS, and move
> the file over to the client.
>
> So far, I have successfully implemented option (2). I am still running some
> tests to see how it performs, but it works fine as such.
>
> My questions are:
> 1. Is option (3) or (4) even possible? I am not sure if we can access the
> table as of a given timestamp through Pig or Hive.
> 2. Is there a better way of achieving this?
>
>
> Thanks!
> Gaurav
>