HBase >> mail # user >> ETL like merge databases to HBase


Shengjie Min 2013-08-01, 15:04
Ted Yu 2013-08-01, 16:13
Shengjie Min 2013-08-02, 01:23
Ted Yu 2013-08-02, 01:27
Jay Vyas 2013-08-02, 02:17
Shahab Yunus 2013-08-02, 03:07
shengjie min 2013-08-03, 20:03
shengjie min 2013-08-05, 00:54

Re: ETL like merge databases to HBase
Shengjie,

This is a typical problem statement for data integration. You need to create a centralized repository of data coming from different data sources. This centralized repository (a warehouse) will have its data refreshed incrementally, and the incremental refresh ensures that you have up-to-date data from all sources. Once the repository is built, you can write aggregates on this data. Sqoop can play some role here, but mostly this will be ETL operations, and you can get by with any ETL tool or Pig. Is there any specific reason for using HBase here?
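
To make the incremental refresh concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how Sqoop or HBase do it: the `customers` table, its integer `updated_at` column, the `crm` source name, and the `incremental_refresh` helper are all hypothetical, an in-memory SQLite database stands in for one customer RDBMS, and a plain dict stands in for the central table.

```python
import sqlite3


def incremental_refresh(source, target, source_name, last_seen):
    """Copy rows changed since `last_seen` from one source into `target`.

    Returns the new watermark (the largest `updated_at` value seen so far).
    """
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, name, updated_at in rows:
        # Prefix the key with the source name so that rows from
        # different databases cannot collide in the central table.
        target[f"{source_name}:{row_id}"] = {"name": name, "updated_at": updated_at}
        last_seen = max(last_seen, updated_at)
    return last_seen


# Hypothetical source database standing in for one customer RDBMS.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
db.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 105)],
)

warehouse = {}  # stand-in for the central table (for example, an HBase table)
watermark = incremental_refresh(db, warehouse, "crm", 0)

# A later change in the source is the only row picked up by the next run.
db.execute("UPDATE customers SET name = 'bobby', updated_at = 110 WHERE id = 2")
watermark = incremental_refresh(db, warehouse, "crm", watermark)
```

The watermark is what keeps each run cheap: only rows modified after the previous run are read from the source. With a strict `>` comparison, rows that share the watermark's timestamp but are committed later would be missed, so a real job would need to account for that.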
Sent from HTC via Rocket.

----- Reply message -----
From: "shengjie min" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: ETL like merge databases to HBase
Date: Mon, Aug 5, 2013 6:24 AM
> Actually, it might be easier to go with a pure RDBMS solution here, since nowadays the slave/master architectures in PostgreSQL and MySQL are mature enough to handle this sort of thing, even for hundreds of thousands of rows.

Let's assume the RDBMSs belong to the customer's applications. I don't have much control over them, and I don't want to mess around with their environments too much either.

Shengjie

On 2 Aug 2013, at 10:17, Jay Vyas <[EMAIL PROTECTED]> wrote:

> HBase doesn't have dynamic views on data outside of itself. But you can easily rerun your Sqoop flow to dump information into HBase.
>
> Actually, it might be easier to go with a pure RDBMS solution here, since nowadays the slave/master architectures in PostgreSQL and MySQL are mature enough to handle this sort of thing, even for hundreds of thousands of rows.