HBase >> mail # user >> How many column families in one table ?


Re: How many column families in one table ?
Hi Vimal,

  It really depends on your usage pattern, but HBase != Bigtable.
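
A rough way to see why the family count shows up in read performance: in HBase each column family in a region has its own store with its own files, so a scan that reads every family has to consult the files of every store. The sketch below is only back-of-the-envelope arithmetic. The function name and the file counts are hypothetical, and it ignores minor compactions, bloom filters, and time-range pruning, all of which reduce the real number.

```python
def files_to_open(families_read: int, files_per_family: int) -> int:
    """Rough count of store files one scan consults in a single region.

    Assumes every family being read has the same number of store files.
    This is a simplification, not how HBase plans a scan.
    """
    return families_read * files_per_family


# Hypothetical file counts: 10 files per family when compaction has been
# neglected, 1 file per family right after a major compaction.
for families in (14, 3):
    for files_per_family in (10, 1):
        count = files_to_open(families, files_per_family)
        print(f"{families} families x {files_per_family} files each = {count} files")
```

Under these assumptions, cutting the family count and compacting regularly both shrink the same product, which is consistent with the improvement reported below.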
On Aug 4, 2013 2:29 AM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:

> Hi,
> I have tested read performance after reducing the number of column families
> from 14 to 3, and yes, there is an improvement.
> Meanwhile, I was going through the paper published by Google on Bigtable.
> It says:
>
> "It is our intent that the number of distinct column
> families in a table be small (in the hundreds at most), and
> that families rarely change during operation."
>
> So is that a theoretical value (100 CFs), or is it possible but not with the
> current version of HBase?
>
>
> On Tue, Jul 2, 2013 at 12:48 AM, Viral Bajaria <[EMAIL PROTECTED]> wrote:
>
> > On Mon, Jul 1, 2013 at 10:06 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> >
> > > Sorry for the typo; please ignore the previous mail. Here is the
> > > corrected one.
> > > 1) I have around 140 columns for each row. Of those, around 100 columns
> > > hold Java primitive data types; the remaining 40 columns contain
> > > serialized Java objects as byte arrays (inside each object is an
> > > ArrayList). Yes, I do delete data, but the frequency is very low
> > > (1 out of 5K operations). I don't run any compaction.
> > >
> >
> > This answers the type of data in each cell, not the size of the data. Can
> > you figure out the average size of the data that you insert into each cell?
> > For example, what is the length of the byte array? Also, for the Java
> > primitives, is it an 8-byte long? A 4-byte int?
> > In addition to that, what is in the row key? How long is that in bytes?
> > Same for the column families: can you share their names? How about the
> > qualifiers?
> >
> > If you have disabled major compactions, you should run one every few days
> > (if not once a day) to consolidate the number of files that each scan will
> > have to open.
> >
> > > 2) I ran the scan while monitoring CPU, I/O, and other system-related
> > > parameters. I found them to be normal, with system load at 0.1-0.3.
> > >
> >
> > How many disks do you have in your box? Have you ever benchmarked the
> > hardware?
> >
> > Thanks,
> > Viral
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
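
Viral's questions about the row key, family names, and qualifiers matter because HBase stores the full key alongside every cell value. The sketch below estimates uncompressed cell size from the classic KeyValue layout: key length (4 bytes), value length (4), row length (2), row, family length (1), family, qualifier, timestamp (8), key type (1), then the value. The row key, family names, qualifier names, and the 512-byte object size are hypothetical placeholders, not values from this thread, and the estimate ignores tags, block encoding, and compression.

```python
# Fixed bytes per cell in the classic KeyValue layout:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1)
KEYVALUE_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row_key: bytes, family: bytes, qualifier: bytes, value_len: int) -> int:
    """Approximate uncompressed size in bytes of one stored cell."""
    return (KEYVALUE_FIXED_OVERHEAD + len(row_key) + len(family)
            + len(qualifier) + value_len)


def row_size(row_key: bytes, cells) -> int:
    """Sum of cell sizes; cells is an iterable of (family, qualifier, value_len)."""
    return sum(cell_size(row_key, fam, qual, vlen) for fam, qual, vlen in cells)


# Hypothetical schema: 100 primitive columns holding 8-byte longs and
# 40 columns holding serialized objects of an assumed 512 bytes each.
row_key = b"user-0000012345"
primitives = [(b"d", f"p{i:03d}".encode(), 8) for i in range(100)]
objects = [(b"o", f"obj{i:02d}".encode(), 512) for i in range(40)]

stored = row_size(row_key, primitives + objects)
payload = sum(vlen for _, _, vlen in primitives + objects)
print(f"payload bytes: {payload}, approximate stored bytes: {stored}")
```

With these made-up names, each 8-byte long occupies 48 bytes on disk before compression, so for the primitive columns the key overhead outweighs the data. That is why short row keys, family names, and qualifiers are usually worth the effort.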