Re: HBase => replication => Hive
Otis Gospodnetic 2011-03-11, 19:09
Hi,
> So, you essentially want to dump HBase tables into sequence files/RC
> files/text files and read it from Hive?

I think that's a Q for J-D. I know that what I had in mind was not about
creating periodic dumps because that means data in Hive would always be behind
data in HBase, but a more real-time replication a la
http://hbase.apache.org/replication.html except with Hive being on the right
side of that pretty picture.

> How do you plan to handle updates, deletes, IVS etc if you use the log
> edits to replicate from hbase to these files? Getting Hive to talk to
> HFiles gives you the same problem.. Isn't it easier to take a snapshot
> of the table when you actually want to run queries on it? In my prelim

The thing is, it looks like there is no way to take a snapshot of a HBase table:
http://blog.sematext.com/2011/03/11/hbase-backup-options/

> testing, I did see Hive-HBase full table scans slower than direct Hive
> table scans but I don't remember the numbers off hand.

This is what made me start this particular thread:
http://search-hadoop.com/m/rMdPh9rFlY1

Otis

> On Thu, Mar 10, 2011 at 10:43 PM, Otis Gospodnetic
> <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> >
> > Since HBase has a mechanism to replicate edit logs to another HBase cluster, I
> > was wondering if people think it would be possible to implement HBase=>Hive
> > replication? (and really make the destination pluggable later on)
> >
> > I'm asking because while one can integrate Hive and HBase by creating external
> > tables in Hive that actually point to tables in HBase, apparently Hive queries
> > run about x5 slower than queries that go against normal Hive tables.
> >
> > And because all HBase export options are for 1 table at a time and not point in
> > time snapshots of the whole table, exporting data from HBase and importing into
> > Hive doesn't sound like a viable option.
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop
> > Hadoop ecosystem search :: http://search-hadoop.com/