MapReduce >> mail # user >> How to solve one Scenario in hadoop ?


How to solve one scenario in Hadoop?
Hi All,
   I have a scenario that our organization is trying to implement with
Hadoop.

Scenario Statement:

---------------------------------------

    Suppose we have various data sources, for example RDBMS, HDFS, and
streaming sources.
*Source Dataset Types:*

1. Single source

2. Joined sources

3. Filtered data set

4. Specific columns
We need to move data from one source to another, for example from HDFS to
an RDBMS or vice versa, depending on a condition. That is, out of the whole
source data, the destination may need everything, only specific (filtered)
data, or joined data. Which approach should we take for each of the dataset
types above?
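
To make the four dataset types concrete, here is a minimal Python sketch
that applies each one to a few in-memory rows. The sample rows and the
column names (id, name, city, amount) are invented for illustration; in
practice the rows would come from HDFS files or RDBMS tables.

```python
# Hypothetical sample rows standing in for two sources.
customers = [
    {"id": 1, "name": "asha", "city": "Pune"},
    {"id": 2, "name": "ravi", "city": "Delhi"},
]
orders = [
    {"id": 1, "amount": 250},
    {"id": 2, "amount": 40},
    {"id": 1, "amount": 75},
]


def single_source(rows):
    """Type 1: copy every row unchanged."""
    return list(rows)


def join_sources(left, right, key):
    """Type 2: inner join of two sources on a shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def filtered(rows, predicate):
    """Type 3: keep only the rows that satisfy a condition."""
    return [row for row in rows if predicate(row)]


def specific_columns(rows, columns):
    """Type 4: keep only the named columns of each row."""
    return [{c: row[c] for c in columns} for row in rows]
```

The types can be combined, for example by filtering the result of a join
and then keeping only the columns the destination needs.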
Here is what I am thinking:

CASE-1   Data from HDFS to HDFS (different cluster), whole data:
         *distcp*

CASE-2   Data from HDFS to HDFS (different cluster), conditional
         (filtered) data: *custom MapReduce program that applies the
         filter and then loads the result*

CASE-3   Data from HDFS to RDBMS, whole data: *Sqoop*

CASE-4   Data from HDFS to RDBMS, conditional data: *Sqoop*

CASE-5   Some data from RDBMS and some data from HDFS, then filter and
         load into HDFS: *JDBC with a MapReduce program*
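
For the filter step in CASE-2, the logic could look roughly like the sketch
below. It is written in Python in the style of a Hadoop Streaming mapper
(read lines, write back only the ones to keep) rather than as the custom
Java MapReduce program mentioned above, purely to show the idea. The
tab-separated layout, the column index, and the value "ACTIVE" are all
assumptions, not details from the scenario. Copying the filtered output to
the other cluster would be a separate step and is not shown.

```python
# Assumptions for illustration only: records are tab-separated, the
# filter column is at index 2, and we keep rows whose value is "ACTIVE".
FIELD_SEP = "\t"
FILTER_COLUMN = 2
WANTED_VALUE = "ACTIVE"


def keep(line):
    """Return True if the record passes the filter condition."""
    fields = line.rstrip("\n").split(FIELD_SEP)
    # Records too short to contain the filter column are dropped.
    return len(fields) > FILTER_COLUMN and fields[FILTER_COLUMN] == WANTED_VALUE


def filter_records(lines):
    """Yield only the matching records, each terminated by a newline."""
    for line in lines:
        if keep(line):
            yield line if line.endswith("\n") else line + "\n"


# Run as a streaming-style mapper, the script would end with:
#   import sys
#   sys.stdout.writelines(filter_records(sys.stdin))
```

Because a pure filter needs no aggregation, a job like this can usually run
as map-only, with no reduce phase.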
Note: Can anyone tell me if I am wrong here, or suggest a different
approach that would be easier to implement?
Regards,

samir.