Hadoop, mail # user - Hadoop with Sharded MySql


+
Srinivas Surasani 2012-05-31, 22:02
+
Edward Capriolo 2012-06-01, 00:12
Re: Hadoop with Sharded MySql
Sujit Dhamale 2012-06-01, 04:52
Hi,
Instead of pulling 70K tables from MySQL into HDFS,
take a dump of all 30 tables and put it into an HBase database.

If you pull 70K tables from MySQL into HDFS, you will need to use Hive, but
modification will not be possible in Hive :(

*@common-user:* please correct me if I am wrong.

Kind Regards
Sujit Dhamale
(+91 9970086652)
On Fri, Jun 1, 2012 at 5:42 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> Maybe you can create some VIEWs, unions, or merge tables on the MySQL
> side to avoid launching so many Sqoop jobs.
>
> On Thu, May 31, 2012 at 6:02 PM, Srinivas Surasani
> <[EMAIL PROTECTED]> wrote:
> > All,
> >
> > We are trying to implement Sqoop in our environment, which has 30
> > sharded MySQL databases. Each of them has around 30 databases with
> > 150 tables in each database, all of which are horizontally sharded
> > (that is, the data is divided across all the tables in MySQL).
> >
> > The problem is that we have a total of around 70K tables which need
> > to be pulled from MySQL into HDFS.
> >
> > So, my question is whether generating 70K Sqoop commands and running
> > them in parallel is feasible.
> >
> > Also, doing incremental updates is going to mean invoking another 70K
> > Sqoop jobs, which in turn kick off MapReduce jobs.
> >
> > The main problem is monitoring and managing this huge number of jobs.
> >
> > Can anyone suggest the best way of doing this, or say whether Sqoop is
> > a good candidate for this type of scenario?
> >
> > Currently the same process is done by generating TSV files on the
> > MySQL server, dumping them onto a staging server, and from there
> > generating hdfs put statements.
> >
> > Appreciate your suggestions!
> >
> >
> > Thanks,
> > Srinivas Surasani
>
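
As a rough illustration of the question above (whether tens of thousands of Sqoop commands can be generated and run in a controlled way), here is a minimal Python sketch that builds one `sqoop import` command per host, database, and table, and runs them with a bounded number in flight. The host names, database names, table names, user, and HDFS root are all hypothetical, and the sketch assumes every database holds the same table names, which the thread does not confirm. Credential handling is left out on purpose. Only the commonly documented `--connect`, `--username`, `--table`, `--target-dir`, and `-m` options are used.

```python
import itertools
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_sqoop_commands(hosts, databases, tables, user, hdfs_root):
    """Return one `sqoop import` argv list per (host, database, table)."""
    commands = []
    for host, db, table in itertools.product(hosts, databases, tables):
        commands.append([
            "sqoop", "import",
            "--connect", f"jdbc:mysql://{host}/{db}",
            "--username", user,  # password handling intentionally omitted
            "--table", table,
            "--target-dir", f"{hdfs_root}/{host}/{db}/{table}",
            "-m", "1",  # a single mapper per shard table (assumption)
        ])
    return commands


def run_all(commands, max_parallel=8, runner=None):
    """Run commands with at most max_parallel in flight.

    `runner` takes an argv list and returns an exit code; by default it
    shells out with subprocess. Returns the commands that failed.
    """
    if runner is None:
        runner = lambda argv: subprocess.run(argv).returncode
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        codes = list(pool.map(runner, commands))
    return [cmd for cmd, code in zip(commands, codes) if code != 0]


if __name__ == "__main__":
    # Hypothetical inventory; a real one would come from the shard catalog.
    cmds = build_sqoop_commands(
        hosts=["shard01.example.com", "shard02.example.com"],
        databases=["db01", "db02"],
        tables=["orders_001", "orders_002"],
        user="etl",
        hdfs_root="/staging/mysql",
    )
    # Dry run: print the commands instead of executing them.
    for argv in cmds:
        print(shlex.join(argv))
```

Calling `run_all(cmds, max_parallel=8)` would execute the imports and hand back the failed commands for retry, which gives one place to monitor the whole batch. Each command still launches its own MapReduce job, so this limits the concurrency but does not reduce the job count; consolidating tables on the MySQL side, as suggested earlier in the thread, is what would do that.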
+
anil gupta 2012-06-01, 05:27
+
Srinivas Surasani 2012-06-01, 16:29
+
Michael Segel 2012-06-02, 00:09