HBase, mail # user - How many column families in one table ?


Re: How many column families in one table ?
Kevin O'dell 2013-08-04, 14:44
Hi Vimal,

  It really depends on your usage pattern, but HBase != Bigtable.
On Aug 4, 2013 2:29 AM, "Vimal Jain" <[EMAIL PROTECTED]> wrote:

> Hi,
> I tested read performance after reducing the number of column families
> from 14 to 3, and yes, there is an improvement.
> Meanwhile, I was going through the paper published by Google on Bigtable.
> It says:
>
> "It is our intent that the number of distinct column
> families in a table be small (in the hundreds at most), and
> that families rarely change during operation."
>
> So is that a theoretical value (100 CFs), or is it possible but not with the
> current version of HBase?
>
>
> On Tue, Jul 2, 2013 at 12:48 AM, Viral Bajaria <[EMAIL PROTECTED]> wrote:
>
> > On Mon, Jul 1, 2013 at 10:06 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> >
> > > Sorry for the typo. Please ignore the previous mail. Here is the corrected
> > > one.
> > > 1) I have around 140 columns for each row. Of those 140, around 100 columns
> > > hold Java primitive data types; the remaining 40 columns contain serialized
> > > Java objects as byte arrays (inside each object is an ArrayList). Yes, I do
> > > delete data, but the frequency is very low (1 out of 5K operations). I
> > > don't run any compaction.
> > >
> >
> > This answers the type of data in each cell, not the size of the data. Can you
> > figure out the average size of the data that you insert in each cell? For
> > example, what is the length of the byte array? Also, for the Java primitives,
> > is it an 8-byte long? A 4-byte int?
> > In addition to that, what is in the row key? How long is it in bytes?
> > Same for the column family: can you share the names of the column families?
> > How about qualifiers?
> >
> > If you have disabled major compactions, you should run one every few days
> > (if not once a day) to consolidate the number of files that each scan will
> > have to open.
> >
> > > 2) I ran the scan keeping in mind the CPU, IO, and other system-related
> > > parameters. I found them to be normal, with system load being 0.1-0.3.
> > >
> >
> > How many disks do you have in your box ? Have you ever benchmarked the
> > hardware ?
> >
> > Thanks,
> > Viral
> >
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>
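
For anyone wondering why Viral asks about the length of the row key, family names, and qualifiers: HBase stores the full key with every cell, so those lengths are paid once per column, not once per row. Below is a rough sketch of that arithmetic in Java, assuming the classic uncompressed KeyValue layout (no tags, no block encoding, no compression). The class name, method name, and all the sample lengths (16-byte row key, 2-byte family name, 8-byte qualifier, 4-byte and 512-byte values) are made up for illustration; only the 100/40 column split comes from the thread.

```java
public class KeyValueSizeSketch {
    // Fixed bytes per cell in the classic KeyValue layout (assumed here):
    // key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + key type (1).
    public static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    // Approximate serialized size of one cell, in bytes.
    public static long cellSize(int rowKeyLen, int familyLen,
                                int qualifierLen, int valueLen) {
        return FIXED_OVERHEAD + rowKeyLen + familyLen + qualifierLen + valueLen;
    }

    public static void main(String[] args) {
        // Hypothetical lengths, not taken from the thread.
        int rowKeyLen = 16;
        int familyLen = 2;
        int qualifierLen = 8;

        // 100 primitive columns (assumed 4-byte int values) and
        // 40 serialized-object columns (assumed 512-byte arrays).
        long primitives = 100 * cellSize(rowKeyLen, familyLen, qualifierLen, 4);
        long blobs = 40 * cellSize(rowKeyLen, familyLen, qualifierLen, 512);

        System.out.println("primitive cells: " + primitives + " bytes");
        System.out.println("blob cells:      " + blobs + " bytes");
        System.out.println("row total:       " + (primitives + blobs) + " bytes");
    }
}
```

With numbers like these, the key portion dwarfs a 4-byte value, which is why short family names and qualifiers are usually recommended. Plug in your real lengths to get an estimate for your own table.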