HBase >> mail # user >> question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)

RE: question on indexes in RDBMS vs. noSQL self created indexes...(disk space wise)

> 1. It's a column based sparse table so nulls take up no space (i.e.
> more room when we need to duplicate)

Correct.  Nulls take up no space.
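To make the point concrete, here is a rough sketch in Python (not HBase code) of why a null costs nothing in a sparse table: only cells that hold a value are stored at all. The SparseTable class, the row keys, and the column names are made up for illustration.

```python
from typing import Dict, Optional, Tuple

# A cell is addressed by (row key, column name). Hypothetical toy model:
# real HBase cells also carry a column family and a timestamp.
CellKey = Tuple[str, str]


class SparseTable:
    """Toy sparse table: a missing column is simply not stored."""

    def __init__(self) -> None:
        self.cells: Dict[CellKey, bytes] = {}

    def put(self, row: str, column: str, value: Optional[bytes]) -> None:
        if value is None:
            # A null is the absence of a cell, so nothing is written.
            self.cells.pop((row, column), None)
        else:
            self.cells[(row, column)] = value

    def get(self, row: str, column: str) -> Optional[bytes]:
        return self.cells.get((row, column))

    def stored_cells(self) -> int:
        return len(self.cells)


table = SparseTable()
table.put("row1", "info:name", b"alice")
table.put("row1", "info:email", None)   # null: no cell is stored
table.put("row2", "info:name", b"bob")
```

The flip side is that every cell that does exist in HBase is stored together with its row key, column name, and timestamp, so non-null values carry more per-cell overhead than a fixed-width RDBMS column would.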

> 2. Indexes take up space in an RDBMS already and are essentially
> duplication in your old RDBMS anyways

Secondary indexes in an RDBMS use additional space.  Primary indexes may not, depending on the database.

> 3. The designs will be quite a bit different eliminating the need
> for those indexes (maybe we only have 3 later out of the 7, and the indexes
> in hbase are a bit bigger than indexes in the old RDBMS too???)

Designs will most likely be different.  Number of indexes may not be the same.  Hard to say more without knowing the specifics.

Hard to say what will be bigger where.  HBase "indexes" (really just tables) are generally highly compressible.  This is generally not the case for RDBMS indexes.
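As a rough illustration of what "really just tables" means, the sketch below (plain Python, not the HBase client API) builds an index whose sorted row keys have the form "<indexed value>|<primary row key>" and answers a lookup with a prefix scan. The data, the "|" separator, and the lookup function are assumptions for the example, and zlib only stands in for whatever compression codec a real table would be configured with.

```python
import bisect
import zlib

# Hypothetical primary table: row key -> city (synthetic data).
users = {f"user{i:05d}": ("austin" if i % 2 else "denver") for i in range(1000)}

# The "index" is just another sorted table. Each row key is
# "<city>|<primary row key>", so all rows for one city are adjacent.
index_rows = sorted(f"{city}|{row}" for row, city in users.items())


def lookup(city: str) -> list:
    """Prefix scan over the sorted index keys, like a scan with a start row."""
    prefix = city + "|"
    start = bisect.bisect_left(index_rows, prefix)
    matches = []
    for key in index_rows[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so we are past the prefix range
        matches.append(key[len(prefix):])
    return matches


# Sorted keys repeat long prefixes, which is what makes them compress well.
raw = "\n".join(index_rows).encode()
compressed = zlib.compress(raw)
print(len(raw), len(compressed))
```

The sizes printed here come from synthetic keys and say nothing about real ratios; the point is only that sorted keys with shared prefixes give a compressor a lot of redundancy to work with.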

An additional point about HBase vs. RDBMS when talking about disk space is that HBase will work just fine on regular 7.2k RPM drives, whereas good performance from RDBMS indexes often requires higher-end 15k RPM drives (cost per gigabyte is MUCH higher on these drives).

> Thanks for any feedback here
> Dean