This is a typical data integration problem. You need to create a centralized repository of data coming from different data sources. This centralized repository (warehouse) will have its data refreshed incrementally, and that incremental refresh keeps it up to date with all the sources. Once the repository is built, you can write aggregates on this data. Sqoop can play some role here, but mostly this will be ETL work, and you can get by with any ETL tool or Pig. Any specific reason for using
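To make the incremental refresh idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `orders` table, its `updated_at` column, the source names `crm` and `billing`, and the plain dict standing in for the central repository. It is not how Sqoop, Pig, or HBase work internally; it only shows the watermark pattern of pulling rows changed since the last run and then aggregating over the merged data.

```python
import sqlite3


def make_source(rows):
    """Build an in-memory stand-in for one source database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def incremental_refresh(source_name, source, warehouse, watermarks):
    """Copy only rows changed since the last run, then advance the watermark."""
    last_seen = watermarks.get(source_name, 0)
    changed = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, amount, updated_at in changed:
        # Key by (source, id) so rows from different databases never collide.
        warehouse[(source_name, row_id)] = amount
        last_seen = max(last_seen, updated_at)
    watermarks[source_name] = last_seen
    return len(changed)


def total_amount(warehouse):
    """Example aggregate computed over the merged repository."""
    return sum(warehouse.values())


# The dict is an assumed stand-in for the central repository.
warehouse, watermarks = {}, {}
crm = make_source([(1, 10.0, 100), (2, 20.0, 110)])
billing = make_source([(1, 5.0, 105)])

# First run: every row is newer than the initial watermark of 0.
first_run = (
    incremental_refresh("crm", crm, warehouse, watermarks)
    + incremental_refresh("billing", billing, warehouse, watermarks)
)

# A row changes in one source; the next run picks up only that row.
crm.execute("UPDATE orders SET amount = 25.0, updated_at = 120 WHERE id = 2")
second_run = incremental_refresh("crm", crm, warehouse, watermarks)

print(first_run, second_run, total_amount(warehouse))
```

One limitation of this pattern: a timestamp watermark sees inserts and updates but not rows deleted at the source, so deletes would need separate handling. Sqoop's incremental import mode (`--incremental`, `--check-column`, `--last-value`) follows the same general idea.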
----- Reply message -----
From: "shengjie min" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: ETL like merge databases to HBase
Date: Mon, Aug 5, 2013 6:24 AM
> Actually, it might be easier to go with a pure RDBMS solution here since nowadays the slave/master architectures in PostgreSQL and MySQL are mature enough to handle this sort of thing even for hundreds of thousands of rows.
Let's assume the RDBMSs belong to the customer's applications. I don't have much control over them, and I don't want to mess around with their environments much either.
On 2 Aug 2013, at 10:17, Jay Vyas <[EMAIL PROTECTED]> wrote:
> HBase doesn't have dynamic views on data outside of itself. But you can easily rerun your Sqoop flow to dump information into HBase.
> Actually, it might be easier to go with a pure RDBMS solution here since nowadays the slave/master architectures in PostgreSQL and MySQL are mature enough to handle this sort of thing even for hundreds of thousands of rows.