HBase, mail # user - Insert into tall table 50% faster than wide table


Re: Insert into tall table 50% faster than wide table
Andrey Stepachev 2010-12-23, 09:57
2010/12/23 Ted Dunning <[EMAIL PROTECTED]>

> But the tall table is FASTER than the wide table.
>

Oops. :)

Maybe you are writing more data? Are you using compression? (With prefixed
qualifiers you write more data, since the UUID can be comparable in length
to an order row.)
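
To make the data-volume question concrete: HBase stores the row key, family, and qualifier with every cell, so prefixing qualifiers with an order ID changes the bytes written per cell. The sketch below estimates that, assuming the classic KeyValue layout as I understand it. The key formats (a 36-character UUID order ID, a `cust-...` customer key, a one-byte family `o`, a `price` column, an 8-byte value) are hypothetical and not taken from the test described in this thread.

```python
# Rough uncompressed size of one HBase cell, assuming the classic KeyValue
# layout: 4-byte key length, 4-byte value length, then the key
# (2-byte row length, row, 1-byte family length, family, qualifier,
# 8-byte timestamp, 1-byte type), then the value.
def keyvalue_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    key_len = 2 + len(row) + 1 + len(family) + len(qualifier) + 8 + 1
    return 4 + 4 + key_len + len(value)

# Hypothetical identifiers, chosen only for illustration.
order_id = b"123e4567-e89b-12d3-a456-426614174000"  # 36-byte UUID string
customer_id = b"cust-00001234"
family = b"o"
column = b"price"
value = b"\x00" * 8

# Tall layout: the order ID is the row key, the column name is the qualifier.
tall_cell = keyvalue_size(order_id, family, column, value)

# Wide layout: the customer ID is the row key, and the qualifier is the
# order ID used as a prefix in front of the column name.
wide_cell = keyvalue_size(customer_id, family, order_id + b":" + column, value)

print("tall bytes/cell:", tall_cell)
print("wide bytes/cell:", wide_cell)
print("extra bytes/cell in wide layout:", wide_cell - tall_cell)
```

Under these assumptions the wide cell is larger by the length of the customer key plus the separator, because the order ID moves into the qualifier while the row key is still repeated in every cell. Compression would shrink the difference on disk.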
>
> On Wed, Dec 22, 2010 at 11:14 PM, Andrey Stepachev <[EMAIL PROTECTED]>
> wrote:
>
> > I think row locks are slowing things down here. Each row you insert tries to
> > acquire a lock and then releases it. The wide table has significantly fewer
> > rows, so far fewer locks are acquired during the insert.
> >
> >
> > 2010/12/23 Bryan Keller <[EMAIL PROTECTED]>
> >
> > > I have been testing a couple of different approaches to storing customer
> > > orders. One is a tall table, where each order is a row. The other is a wide
> > > table where each customer is a row, and orders are columns in the row. I am
> > > finding that inserts into the tall table, i.e. adding rows for every order,
> > > are roughly 50% faster than inserts into the wide table, i.e. adding a row
> > > for a customer and then adding columns for orders.
> > >
> > > In my test, there are 10,000 customers, each customer has 600 orders, and
> > > each order has 10 columns. The tall table approach results in 6 million rows
> > > of 10 columns. The wide table approach results in 10,000 rows of 6,000
> > > columns. I'm using HBase 0.89-20100924 and Hadoop 0.20.2. I am adding the
> > > orders using a Put for each order, submitted in batches of 1,000 as a list
> > > of Puts.
> > >
> > > Are there techniques to speed up inserts with the wide table approach that
> > > I am perhaps overlooking?
> > >
> > >
> >
>
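
For reference, the figures in the original test can be checked with a short sketch. It reproduces the row and cell counts quoted above and counts how many distinct rows a single batch of Puts touches in each layout. The row-key formats are hypothetical, and the sketch assumes orders are submitted grouped by customer, which the thread does not state. It counts rows only and makes no claim about how HBase actually takes row locks.

```python
CUSTOMERS = 10_000
ORDERS_PER_CUSTOMER = 600
COLUMNS_PER_ORDER = 10
BATCH_SIZE = 1_000

# Totals quoted in the thread.
total_puts = CUSTOMERS * ORDERS_PER_CUSTOMER           # one Put per order
tall_rows = total_puts                                 # each order is a row
wide_rows = CUSTOMERS                                  # each customer is a row
wide_columns_per_row = ORDERS_PER_CUSTOMER * COLUMNS_PER_ORDER
total_cells = total_puts * COLUMNS_PER_ORDER           # same in both layouts
total_batches = total_puts // BATCH_SIZE

# Hypothetical row-key formats, for illustration only.
def tall_row_key(customer: int, order: int) -> str:
    return f"order-{customer:05d}-{order:03d}"

def wide_row_key(customer: int, order: int) -> str:
    return f"customer-{customer:05d}"  # every order of a customer shares a row

def distinct_rows_in_first_batch(row_key) -> int:
    """Count distinct row keys among the first BATCH_SIZE Puts,
    assuming Puts are generated customer by customer."""
    keys = set()
    count = 0
    for customer in range(CUSTOMERS):
        for order in range(ORDERS_PER_CUSTOMER):
            keys.add(row_key(customer, order))
            count += 1
            if count == BATCH_SIZE:
                return len(keys)
    return len(keys)

tall_rows_per_batch = distinct_rows_in_first_batch(tall_row_key)
wide_rows_per_batch = distinct_rows_in_first_batch(wide_row_key)

print("tall rows:", tall_rows, "wide rows:", wide_rows)
print("columns per wide row:", wide_columns_per_row)
print("cells in either layout:", total_cells)
print("batches of Puts:", total_batches)
print("distinct rows in one batch (tall, wide):",
      tall_rows_per_batch, wide_rows_per_batch)
```

Both layouts write the same number of cells in the same number of batches; what differs is how many distinct rows each batch touches.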