-
question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)
Hiller, Dean 2010-12-21, 23:36
I was asked about a concern over indexes: in one case we would have to duplicate the data seven times if we move from an RDBMS to a NoSQL DB.
My reply, which I would like feedback on (please let me know if I am dead wrong or what else I may be missing), was:
1. It's a column-based sparse table, so nulls take up no space (i.e., more room when we need to duplicate).
2. Indexes already take up space in an RDBMS and are essentially duplication in your old RDBMS anyway.
3. The designs will be quite a bit different, eliminating the need for some of those indexes (maybe we only have 3 later out of the 7; and are the indexes in HBase a bit bigger than indexes in the old RDBMS too?).
Thanks for any feedback here
Dean
This message and any attachments are intended only for the use of the addressee and may contain information that is privileged and confidential. If the reader of the message is not the intended recipient or an authorized representative of the intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail and delete the message and any attachments from your system.
-
RE: question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)
Jonathan Gray 2010-12-22, 01:44
> 1. It's a column-based sparse table, so nulls take up no space (i.e., more room when we need to duplicate).
Correct. Nulls take up no space.
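To make the point about nulls concrete, here is a minimal sketch in Python of a sparse, cell-oriented layout. It is a toy model, not the HBase API: the helper names `to_cells` and `stored_bytes`, the `d:` column family, and the sample records are all made up for illustration, and the size estimate ignores timestamps and other per-cell overhead.

```python
# Toy model of a sparse, cell-oriented store (NOT the HBase API).
# Each stored cell is a (row key, "family:qualifier", value) entry;
# a column that is absent for a row simply has no entry at all.

def to_cells(row_key, record):
    """Return one cell per non-null field; None fields produce nothing."""
    return [(row_key, column, value)
            for column, value in record.items()
            if value is not None]

def stored_bytes(cells):
    """Rough size estimate: every cell repeats its row key and column name."""
    return sum(len(row) + len(col) + len(val) for row, col, val in cells)

# Hypothetical sample rows: one fully populated, one mostly null.
full = {"d:name": "alice", "d:email": "a@example.com", "d:phone": "555-0100"}
sparse = {"d:name": "bob", "d:email": None, "d:phone": None}

full_cells = to_cells("user1", full)      # three cells stored
sparse_cells = to_cells("user2", sparse)  # one cell stored, nulls cost nothing
```

The flip side, visible in `stored_bytes`, is that each stored cell carries its own copy of the row key and column name, so short keys and qualifiers help keep the non-null data small.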
> 2. Indexes already take up space in an RDBMS and are essentially duplication in your old RDBMS anyway.
Secondary indexes in an RDBMS use additional space. Primary indexes may not, depending on the database.
> 3. The designs will be quite a bit different, eliminating the need for some of those indexes (maybe we only have 3 later out of the 7; and are the indexes in HBase a bit bigger than indexes in the old RDBMS too?).
Designs will most likely be different. Number of indexes may not be the same. Hard to say more without knowing the specifics.
Hard to say what will be bigger where. HBase "indexes" (really just tables) are generally highly compressible. This is generally not the case for RDBMS indexes.
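As an illustration of an "index" that is really just another table, here is a small Python sketch. It is a toy model, not the HBase client API: a table is represented as a sorted list of row keys, and the `<value><SEP><primary key>` key layout, the `SEP` separator, the function names, and the sample `users` data are assumptions chosen for the example.

```python
import bisect

# Toy model only: an index "table" is a sorted list of row keys, standing in
# for the sorted-by-row-key layout of a real table. Not the HBase client API.

SEP = "\x00"  # hypothetical separator between indexed value and primary key

def build_index(rows, column):
    """Create index row keys of the form <value><SEP><primary row key>."""
    return sorted(f"{rec[column]}{SEP}{key}"
                  for key, rec in rows.items()
                  if rec.get(column) is not None)

def lookup(index, value):
    """Prefix scan: return primary row keys whose indexed column equals value."""
    prefix = value + SEP
    start = bisect.bisect_left(index, prefix)
    hits = []
    for entry in index[start:]:
        if not entry.startswith(prefix):
            break
        hits.append(entry[len(prefix):])
    return hits

# Hypothetical data table keyed by user id.
users = {
    "user1": {"city": "Denver"},
    "user2": {"city": "Boulder"},
    "user3": {"city": "Denver"},
    "user4": {},  # no city: contributes no index entry
}
city_index = build_index(users, "city")
denver_users = lookup(city_index, "Denver")
```

In this layout, rows with the same indexed value sit next to each other and share a long key prefix, which is one plausible reason such index tables tend to compress well. Rows without the indexed column add nothing to the index.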
An additional point about HBase vs. RDBMS when talking about disk space is that HBase will work just fine on regular 7.2k RPM drives, whereas good performance from RDBMS indexes often requires higher-end 15k RPM drives (cost per gigabyte is MUCH higher on these drives).
-
Re: question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)
Hari Sreekumar 2010-12-22, 04:38
A related question, JG: do null column families take space? E.g., what if I create a column family which gets filled in only about 1 in a million rows and remains empty otherwise?
thanks, hari