Re: HBase => replication => Hive
Hi,

> So, you essentially want to dump HBase tables into sequence files/RC
> files/text files and read it from Hive?

I think that's a Q for J-D.
What I had in mind was not periodic dumps, because then the data in Hive
would always lag behind the data in HBase, but something closer to real-time
replication a la http://hbase.apache.org/replication.html, except with
Hive on the right side of that pretty picture.

> How do you plan to handle updates, deletes, IVS etc if you use the log
> edits to replicate from hbase to these files? Getting Hive to talk to
> HFiles gives you the same problem.. Isn't it easier to take a snapshot
> of the table when you actually want to run queries on it? In my prelim

The thing is, it looks like there is no way to take a snapshot of an HBase table:
http://blog.sematext.com/2011/03/11/hbase-backup-options/

> testing, I did see Hive-HBase full table scans slower than direct Hive
> table scans but I don't remember the numbers offhand.

This is what made me start this particular thread:
http://search-hadoop.com/m/rMdPh9rFlY1

Otis
> On Thu, Mar 10, 2011 at 10:43 PM, Otis Gospodnetic
> <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> >
> > Since HBase has a mechanism to replicate edit logs to another HBase
> > cluster, I was wondering if people think it would be possible to
> > implement HBase=>Hive replication? (and really make the destination
> > pluggable later on)
> >
> > I'm asking because while one can integrate Hive and HBase by creating
> > external tables in Hive that actually point to tables in HBase,
> > apparently Hive queries run about 5x slower than queries that go
> > against normal Hive tables.
> >
> > And because all HBase export options are for 1 table at a time and not
> > point-in-time snapshots of the whole table, exporting data from HBase
> > and importing into Hive doesn't sound like a viable option.
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop
> > Hadoop ecosystem search :: http://search-hadoop.com/
> >
>