HBase >> mail # user >> Ideal row size


Re: Ideal row size
Hi Eric,

An ideal cell size would probably be the size of a block, so 64KB
including the keys. Having bigger cells would inflate the size of your
blocks, and then you'd be outside the normal HBase settings.

That, and do some experiments.

J-D
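
To make J-D's suggestion concrete, here is a minimal sketch of the arithmetic for packing small, fixed-size values into cells that stay under the default 64KB HFile block size. The 50-byte per-cell key overhead and the function name are assumptions for illustration, not figures from this thread; measure your actual key, column family, and qualifier lengths before relying on them.

```python
# Sketch: how many fixed-size values fit in one cell without
# exceeding HBase's default 64 KB block size (per J-D's advice).

BLOCK_SIZE = 64 * 1024   # default HFile block size, in bytes
KEY_OVERHEAD = 50        # assumed bytes for row key + cf + qualifier + timestamp

def values_per_cell(value_size, block_size=BLOCK_SIZE, overhead=KEY_OVERHEAD):
    """Count of fixed-size values that fit in a single cell of one block."""
    usable = block_size - overhead
    return max(1, usable // value_size)

# For 100-byte values: (65536 - 50) // 100 = 654 values per cell
print(values_per_cell(100))
```

With a number like this in hand, the MR job writing the index can simply buffer that many values before emitting each cell, then confirm the resulting block sizes experimentally as J-D suggests.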

On Tue, Aug 7, 2012 at 6:35 AM, Eric Czech <[EMAIL PROTECTED]> wrote:
> Hello everyone,
>
> I'm trying to store many small values in indexes created via MR jobs,
> and I was hoping to get some advice on how to structure my rows.
> Essentially, I have complete control over how large the rows should be
> as the values are small, consistent in size, and can be grouped
> together in any way I'd like.  My question, then, is: what's the ideal
> size for a row in HBase, in bytes?  I'm trying to determine how to
> group my values together into larger values, and I think having a
> target size to hit would make that a lot easier.
>
> I know fewer rows is generally better to avoid the repetitive storage
> of keys, cfs, and qualifiers provided that those rows still suit a
> given application, but I'm not sure at what point the scale will tip
> in the other direction and I'll start to see undue memory pressure or
> compaction issues with rows that are too large.
>
> Thanks in advance!