HBase, mail # user - Insert into tall table 50% faster than wide table


Re: Insert into tall table 50% faster than wide table
Bryan Keller 2010-12-23, 00:16
Lookup/scan performance appears to be the same or better, but I don't want to derail my original question. The much slower write performance will cause problems for me unless I can resolve it.

On Dec 22, 2010, at 3:52 PM, Peter Haidinyak wrote:

> Interesting, do you know what the time difference would be on the other side, doing a lookup/scan?
>
> Thanks
>
> -Pete
>
> -----Original Message-----
> From: Bryan Keller [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, December 22, 2010 3:41 PM
> To: [EMAIL PROTECTED]
> Subject: Insert into tall table 50% faster than wide table
>
> I have been testing a couple of different approaches to storing customer orders. One is a tall table, where each order is a row. The other is a wide table, where each customer is a row and orders are columns in the row. I am finding that inserts into the tall table, i.e. adding a row for every order, are roughly 50% faster than inserts into the wide table, i.e. adding a row for a customer and then adding columns for orders.
>
> In my test, there are 10,000 customers, each customer has 600 orders, and each order has 10 columns. The tall table approach results in 6 million rows of 10 columns. The wide table approach results in 10,000 rows of 6,000 columns. I'm using HBase 0.89-20100924 and Hadoop 0.20.2. I am adding the orders using a Put for each order, submitted in batches of 1000 as a list of Puts.
>
> Are there techniques to speed up inserts with the wide table approach that I am perhaps overlooking?
>
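
For readers following the thread, the two layouts described above can be modeled in plain Python to check the row and column counts. This is only a sketch and does not use the HBase client: the row key format, the qualifier format, and the names `tall_cells`, `wide_cells`, `table_shape` and `FIELDS` are hypothetical, chosen for illustration, and are not taken from the original test.

```python
# Plain-Python model of the two layouts; no HBase client is involved.
# Key and qualifier formats are hypothetical, for illustration only.

def tall_cells(customer_id, order_id, fields):
    """Tall layout: one row per order, one column per order field."""
    row_key = f"{customer_id:05d}-{order_id:03d}"
    return [(row_key, field) for field in fields]

def wide_cells(customer_id, order_id, fields):
    """Wide layout: one row per customer, order id folded into the qualifier."""
    row_key = f"{customer_id:05d}"
    return [(row_key, f"{order_id:03d}:{field}") for field in fields]

def table_shape(layout, customers, orders_per_customer, fields):
    """Return (row count, columns per row, total cells) for a layout."""
    rows = {}
    for c in range(customers):
        for o in range(orders_per_customer):
            for row_key, qualifier in layout(c, o, fields):
                rows.setdefault(row_key, set()).add(qualifier)
    # Every row has the same width in this model.
    widths = {len(qualifiers) for qualifiers in rows.values()}
    assert len(widths) == 1
    total_cells = sum(len(qualifiers) for qualifiers in rows.values())
    return len(rows), widths.pop(), total_cells

# Ten hypothetical order fields, standing in for the 10 columns per order.
FIELDS = [f"f{i}" for i in range(10)]

# Scaled down to 10 customers so the model runs quickly.
print(table_shape(tall_cells, 10, 600, FIELDS))  # (6000, 10, 60000)
print(table_shape(wide_cells, 10, 600, FIELDS))  # (10, 6000, 60000)
```

Scaling the customer count from 10 to 10,000 gives the figures in the message: 6 million rows of 10 columns for the tall table and 10,000 rows of 6,000 columns for the wide table. Both layouts hold the same number of cells, so the difference under test is only in how those cells are grouped into rows.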