HBase, mail # user - How many column families in one table ?


Re: How many column families in one table ?
Vimal Jain 2013-07-01, 17:03
Hi Lars,
1) Each row has around 140 columns. Of these, around 100 hold Java
primitive data types and the remaining 40 contain serialized Java objects
as byte arrays. Yes, I do delete data, but very rarely (about 1 in 5K
operations). I don't run any compactions.
2) I ran the scan while keeping an eye on CPU, IO and other system
parameters. They looked normal, with the system load at 0.1-0.3.
3) Yes, I keep 3 versions of each cell (the default value).
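
For scale, the figures quoted below in this thread (1.6 million rows, about 140 columns per row, 30-40 minutes for a full scan) can be turned into a rough throughput estimate. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes the scan returns one KeyValue per column, i.e. only the latest version of each cell.

```python
# Figures taken from this thread.
ROWS = 1_600_000
COLUMNS_PER_ROW = 140
SCAN_MINUTES = (30, 40)


def scan_throughput(rows, columns_per_row, minutes):
    """Return (rows/sec, KeyValues/sec) for a full scan lasting `minutes`."""
    seconds = minutes * 60
    # Assumption: one KeyValue per column is returned (latest version only).
    key_values = rows * columns_per_row
    return rows / seconds, key_values / seconds


for minutes in SCAN_MINUTES:
    rows_per_sec, kvs_per_sec = scan_throughput(ROWS, COLUMNS_PER_ROW, minutes)
    print(f"{minutes} min: {rows_per_sec:,.0f} rows/s, {kvs_per_sec:,.0f} KeyValues/s")
```

Under these assumptions the scan works out to roughly 670-890 rows per second, or about 93,000-124,000 KeyValues per second.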
On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> The performance you're seeing is definitely not typical. A couple of
> further questions:
> - How large are your KVs (columns)?
> - Do you delete data? Do you run major compactions?
> - Can you measure CPU, IO, context switches, etc., during the scanning?
> - Do you have many versions of the columns?
>
>
> Note that HBase is a key value store, i.e. the storage is sparse. Each
> column is represented by its own key value pair, and HBase has to do the
> work to reassemble the data.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Vimal Jain <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Monday, July 1, 2013 4:44 AM
> Subject: Re: How many column families in one table ?
>
>
> Hi,
> We had some hardware constraints, and our total data size was only in
> the GBs.
> That's why, to start with HBase, we began in pseudo-distributed mode,
> planning to upgrade to fully distributed mode if required.
>
>
>
> On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > bq. I have configured Hbase in pseudo distributed mode on top of HDFS.
> >
> > What was the reason for using pseudo-distributed mode in a production
> > setup?
> >
> > Cheers
> >
> > On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> >
> > > Thanks Dhaval/Michael/Ted/Otis for your replies.
> > > Actually, I asked this question because I am seeing some performance
> > > degradation in my production HBase setup.
> > > I have configured Hbase in pseudo distributed mode on top of HDFS.
> > > I have created 17 column families :( . I am actually using 14 of
> > > these 17 column families.
> > > Each column family has on average 8-10 column qualifiers, so in total
> > > there are around 140 columns for each row key.
> > > I have around 1.6 million rows in the table.
> > > A complete scan of the table for all 140 columns takes around 30-40
> > > minutes.
> > > Is this normal, or should I redesign my table schema (probably merging
> > > 4-5 column families into one, so that I end up with just 3-4 CFs)?
> > >
> > >
> > >
> > > On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hm, works for me -
> > > >
> > > > http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
> > > >
> > > > Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42
> > > >
> > > > Otis
> > > > --
> > > > Solr & ElasticSearch Support -- http://sematext.com/
> > > > Performance Monitoring -- http://sematext.com/spm
> > > >
> > > >
> > > >
> > > > On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
> > > > > Hi All ,
> > > > > Thanks for your replies.
> > > > >
> > > > > Ted,
> > > > > Thanks for the link, but it's not working. :(
> > > > >
> > > > >
> > > > > On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > >> Vimal:
> > > > >> Please also refer to:
> > > > >>
> > > > >> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
> > > > >>
> > > > >> On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel <[EMAIL PROTECTED]> wrote:
> > > > >>
> > > > >> > Short answer... As few as possible.
> > > > >> >
> > > > >> > 14 CF doesn't make too much sense.
> > > > >> >
> > > > >> > Sent from a remote device. Please excuse any typos...

Thanks and Regards,
Vimal Jain