HBase >> mail # user >> How many column families in one table ?


Re: How many column families in one table ?
Sorry for the typo; please ignore my previous mail. Here is the corrected
one.
1) I have around 140 columns in each row. Of those, around 100 columns
hold Java primitive data types; the remaining 40 columns contain serialized
Java objects as byte arrays (each object holds an ArrayList). Yes, I do
delete data, but very infrequently (about 1 in 5K operations). I
don't run any compactions.
2) I ran the scan while watching CPU, IO and other system-related
parameters. I found them to be normal, with system load at 0.1-0.3.
3) Yes, I have 3 versions of each cell (the default value).
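
For reference, the figures reported in this thread (around 1.6 million rows, around 140 columns per row, a full scan taking 30-40 minutes) can be turned into a rough scan rate. This is only a back-of-envelope sketch: it assumes every row really carries all 140 columns and that the scan returns just the latest version of each cell, and the function name scan_rates is made up for illustration.

```python
# Figures reported in this thread; treat them as approximate.
ROWS = 1_600_000          # "around 1.6 millions rows"
COLUMNS_PER_ROW = 140     # "around 140 columns" per row key
SCAN_MINUTES = (30, 40)   # full scan takes "around 30-40 minutes"


def scan_rates(rows, columns_per_row, minutes):
    """Return (rows_per_second, cells_per_second) for one full table scan.

    Assumption: each column is one key value pair and only the latest
    version of each cell is returned by the scan.
    """
    seconds = minutes * 60
    cells = rows * columns_per_row
    return rows / seconds, cells / seconds


for minutes in SCAN_MINUTES:
    rows_per_s, cells_per_s = scan_rates(ROWS, COLUMNS_PER_ROW, minutes)
    print(f"{minutes} min: {rows_per_s:,.0f} rows/s, {cells_per_s:,.0f} cells/s")
```

Under those assumptions the scan is moving roughly 670-890 rows per second, i.e. on the order of 100K key value pairs per second.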

On Mon, Jul 1, 2013 at 10:33 PM, Vimal Jain <[EMAIL PROTECTED]> wrote:

> Hi Lars,
> 1)I have around 140 columns for each row , out of 140 , around 100 rows
> are holds java primitive data type , remaining 40 rows contains serialized
> java object as byte array. Yes , I do delete data but the frequency is very
> less ( 1 out of 5K operations ). I dont run any compaction.
> 2) I had ran scan keeping in mind the CPU,IO and other system related
> parameters.I found them to be normal with system load being 0.1-0.3.
> 3) Yes i have 3 versions of cell ( default value).
>
>
> On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> The performance you're seeing is definitely not typical. A couple of
>> further questions:
>> - How large are your KVs (columns)?
>> - Do you delete data? Do you run major compactions?
>> - Can you measure: CPU, IO, context switches, etc, during the scanning?
>> - Do you have many versions of the columns?
>>
>>
>> Note that HBase is a key value store, i.e. the storage is sparse. Each
>> column is represented by its own key value pair, and HBase has to do the
>> work to reassemble the data.
>>
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>>  From: Vimal Jain <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Sent: Monday, July 1, 2013 4:44 AM
>> Subject: Re: How many column families in one table ?
>>
>>
>> Hi,
>> We had some hardware constraints, and our total data size was in the
>> GBs.
>> That's why, to start with HBase, we first began with pseudo-distributed
>> mode, thinking we would upgrade to fully distributed mode if required.
>>
>>
>>
>> On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>>
>> > bq. I have configured Hbase in pseudo distributed mode on top of HDFS.
>> >
>> > What was the reason for using pseudo distributed mode in production
>> > setup ?
>> >
>> > Cheers
>> >
>> > On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
>> >
>> > > Thanks Dhaval/Michael/Ted/Otis for your replies.
>> > > Actually, I asked this question because I am seeing some performance
>> > > degradation in my production HBase setup.
>> > > I have configured Hbase in pseudo distributed mode on top of HDFS.
>> > > I have created 17 column families :( . I am actually using 14 of
>> > > these 17 column families.
>> > > Each column family has on average 8-10 column qualifiers, so in
>> > > total there are around 140 columns for each row key.
>> > > I have around 1.6 million rows in the table.
>> > > A complete scan of the table for all 140 columns takes around
>> > > 30-40 minutes.
>> > > Is this normal, or should I redesign my table schema (probably merging
>> > > 4-5 column families into one, so that I end up with just 3-4 CFs)?
>> > >
>> > >
>> > >
>> > > On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic <
>> > > [EMAIL PROTECTED]> wrote:
>> > >
>> > > > Hm, works for me -
>> > > >
>> > > > http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
>> > > >
>> > > > Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42
>> > > >
>> > > > Otis
>> > > > --
>> > > > Solr & ElasticSearch Support -- http://sematext.com/
>> > > > Performance Monitoring -- http://sematext.com/spm
>> > > >
>> > > >
>> > > >
>> > > > On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain <[EMAIL PROTECTED]>

Thanks and Regards,
Vimal Jain