Re: Porting SQL DB into HBASE
You are mentioning 2 different reasons:

Open source? Well, get MySQL.

Large datasets? The table sizes that you reported in the earlier mails don't
seem to justify a move to HBase. Keep in mind: to run HBase stably in
production you would ideally want at least 10 nodes, and you will
have no SQL available. Make sure you are aware of the trade-offs between
HBase and an RDBMS before you decide. Even 100 million rows can be handled
by a relational database if it is tuned properly.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
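
On the schema question raised further down the thread (about 400 small static tables with no keys linking them): one common way to keep the table count low in a key-ordered store is to put unrelated entities in a single table and prefix each row key with the entity name. The Python sketch below only illustrates that idea. `SortedStore` and `row_key` are hypothetical names, the store is an in-memory stand-in rather than an HBase client, and the sample rows are made up.

```python
from bisect import bisect_left


def row_key(entity: str, primary_key: str) -> str:
    """Compose a row key: entity name, a ':' separator, then the original key."""
    if ":" in entity:
        raise ValueError("entity name must not contain the ':' separator")
    return f"{entity}:{primary_key}"


class SortedStore:
    """Toy key-ordered table (hypothetical stand-in, not an HBase client)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> column dict

    def put(self, key, columns):
        # Insert new keys at their sorted position; overwrite existing rows.
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._rows[key] = dict(columns)

    def get(self, key):
        # Single-row lookup by exact key; returns None if absent.
        return self._rows.get(key)

    def scan_prefix(self, prefix):
        # Start at the first key >= prefix and stop once keys no longer match.
        idx = bisect_left(self._keys, prefix)
        while idx < len(self._keys) and self._keys[idx].startswith(prefix):
            key = self._keys[idx]
            yield key, self._rows[key]
            idx += 1


# Made-up sample rows from two unrelated "static tables" sharing one store.
store = SortedStore()
store.put(row_key("county", "0001"), {"name": "Example County A"})
store.put(row_key("doctor", "0042"), {"name": "Example Doctor"})
store.put(row_key("county", "0002"), {"name": "Example County B"})

county_keys = [key for key, _ in store.scan_prefix("county:")]
```

A prefix scan over `county:` returns only the county rows, so per-entity lookups stay simple without one table per entity. Whether this suits the real data depends on the queries, which the thread does not describe.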
On Mon, Apr 12, 2010 at 10:17 PM, kranthi reddy <[EMAIL PROTECTED]> wrote:

> Hi all,
>
>
> @Amandeep : The main reason for porting to HBase is that it is open
> source. Currently the NGO is paying a high licensing fee for Microsoft SQL
> Server. So in order to save money we planned to port to HBase because of
> its scalability for large datasets.
>
> @Jonathan : The problem is that these static tables can't be combined. Each
> table describes a different entity. For example, one static table might
> contain information about all the counties in a country, and another table
> might contain information about all the doctors present in the country.
>
> That is the reason why I don't think it is possible to combine these static
> tables as they don't have any primary/foreign keys referencing others.
>
> The dynamic tables are pretty huge (small when compared to what HBase can
> support). But these tables will be expanded and might contain up to 100
> million rows in the near future.
>
> Thank you,
> kranthi
>
> On Tue, Apr 13, 2010 at 12:17 AM, Michael Segel
> <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > Just an idea: take a look at a hierarchical design like Pick.
> > I know it's doable, but I don't know how well it will perform.
> >
> >
> > > Date: Mon, 12 Apr 2010 14:25:48 +0530
> > > Subject: Re: Porting SQL DB into HBASE
> > > From: [EMAIL PROTECTED]
> > > To: [EMAIL PROTECTED]
> > >
> > > Hi Jonathan,
> > >
> > > Sorry for the late response. Missed your reply.
> > >
> > > The problem is, around 80% (400) of the tables are static tables and the
> > > remaining 20% (100) are dynamic tables that are updated on a daily basis.
> > > Denormalising these 20% of the tables is also extremely difficult, and we
> > > are planning to port them directly into HBase. Denormalising these tables
> > > would also lead to a lot of redundant data.
> > >
> > > Static tables have entries numbering in the hundreds, mostly fewer than
> > > 1000 entries (rows), whereas the dynamic tables have more than 20,000
> > > entries, and each entry might be updated/modified at least once a week.
> > >
> > > Regards,
> > > kranthi
> > >
> > >
> > > On Wed, Mar 31, 2010 at 10:23 PM, Jonathan Gray <[EMAIL PROTECTED]> wrote:
> > >
> > > > Kranthi,
> > > >
> > > > HBase can handle a good number of tables, but that means tens or maybe a
> > > > hundred. If you have 500 tables you should definitely be rethinking your
> > > > schema design. The issue is less about HBase being able to handle lots of
> > > > tables, and much more about whether scattering your data across lots of
> > > > tables will be performant at read time.
> > > >
> > > >
> > > > 1)  Impossible to answer that question without knowing the schemas of the
> > > > existing tables.
> > > >
> > > > 2)  Not really any relation between fault tolerance and the number of
> > > > tables, except potentially for recovery time, but this would be the same
> > > > with few, very large tables.
> > > >
> > > > 3)  No difference in write performance.  Read performance for simple key
> > > > lookups would not be impacted, but most likely having data spread out like
> > > > this will mean you'll need joins of some sort.
> > > >
> > > > Can you tell more about your data and queries?
> > > >
> > > > JG
> > > >
> > > > > -----Original Message-----
> > > > > From: kranthi reddy [mailto:[EMAIL PROTECTED]]