HBase user mailing list: HBase Writes With Large Number of Columns


Pankaj Misra 2013-03-25, 16:55
Ted Yu 2013-03-25, 16:59
Pankaj Misra 2013-03-25, 17:18
Ted Yu 2013-03-25, 17:45
Pankaj Misra 2013-03-25, 18:03
Ted Yu 2013-03-25, 18:24
Jean-Marc Spaggiari 2013-03-25, 18:27
Pankaj Misra 2013-03-25, 18:40
Ted Yu 2013-03-25, 19:39
Pankaj Misra 2013-03-25, 20:54
Jean-Marc Spaggiari 2013-03-25, 23:49
ramkrishna vasudevan 2013-03-26, 06:19
Asaf Mesika 2013-03-27, 21:52
Ted Yu 2013-03-27, 22:06
Asaf Mesika 2013-03-27, 22:28
Ted Yu 2013-03-27, 22:33
Re: HBase Writes With Large Number of Columns
Hello Pankaj,

     What configuration are you using? Also, what are the H/W specs?
Maybe tuning some of these would make things faster. Although the amount
of data being inserted is small, the amount of metadata being generated
is higher: you now have to generate the key+qualifier+TS triplet for
40 cells, versus 20 in the earlier case.
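To put rough numbers on the metadata point above, here is a minimal sketch that estimates the per-row KeyValue metadata for 20 vs. 40 cells, assuming the standard KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type); the row key, family and qualifier lengths are illustrative, not the actual schema.

public class KeyValueOverheadEstimate {

    // Fixed per-cell fields in the KeyValue wire format:
    // 4B key length + 4B value length + 2B row length + row key
    // + 1B family length + family + qualifier + 8B timestamp + 1B key type.
    static long perRowMetadata(int cells, int rowKeyLen, int familyLen, int qualifierLen) {
        long perCell = 4 + 4              // key length + value length
                     + 2 + rowKeyLen      // row length + row key (repeated for every cell)
                     + 1 + familyLen      // family length + family (repeated for every cell)
                     + qualifierLen       // qualifier
                     + 8 + 1;             // timestamp + key type
        return perCell * cells;
    }

    public static void main(String[] args) {
        int rowKeyLen = 16, familyLen = 2, qualifierLen = 8;  // illustrative sizes
        long payload = 1536;                                   // ~1.5 KB of values per row

        for (int cells : new int[] {20, 40}) {
            long meta = perRowMetadata(cells, rowKeyLen, familyLen, qualifierLen);
            System.out.printf("%d cells: ~%d bytes of metadata for %d bytes of values (%.0f%% overhead)%n",
                    cells, meta, payload, 100.0 * meta / payload);
        }
        // Doubling the cell count doubles the metadata even though the payload
        // stays at ~1.5 KB, which fits the observed drop in write throughput.
    }
}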

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Tue, Mar 26, 2013 at 12:10 AM, Pankaj Misra
<[EMAIL PROTECTED]> wrote:

> Firstly, thanks a lot Jean and Ted for your extended help; very much
> appreciated.
>
> Yes Ted, I am writing to all 40 columns, and the 1.5 KB of record data is
> distributed across these columns.
>
> Jean, some columns store as little as a single byte, while a few of the
> columns store as much as 80-125 bytes of data. The overall record size is
> 1.5 KB. These records are being written as batch mutations through the
> Thrift API, wherein we write 100 records per batch mutation.
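For reference, a batch like the one described above, done with the native 0.94 Java client rather than the Thrift batch-mutation API, would look roughly like this minimal sketch; the table name, column family and qualifiers are hypothetical.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal sketch of a 100-row batch with 40 columns per row, using the
// 0.94-era Java client (the original poster uses the Thrift batch-mutation
// API instead). Table and column names are hypothetical.
public class BatchWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table");   // hypothetical table name
        table.setAutoFlush(false);                        // buffer writes client-side
        table.setWriteBufferSize(4 * 1024 * 1024);

        byte[] cf = Bytes.toBytes("cf");
        List<Put> batch = new ArrayList<Put>(100);
        for (int row = 0; row < 100; row++) {             // 100 records per batch
            Put put = new Put(Bytes.toBytes("row-" + row));
            for (int col = 0; col < 40; col++) {          // 40 columns, ~1.5 KB per row in total
                put.add(cf, Bytes.toBytes("c" + col), Bytes.toBytes("value-" + col));
            }
            batch.add(put);
        }
        table.put(batch);        // writes go out when the write buffer flushes
        table.flushCommits();
        table.close();
    }
}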
>
> Thanks and Regards
> Pankaj Misra
>
>
> ________________________________________
> From: Jean-Marc Spaggiari [[EMAIL PROTECTED]]
> Sent: Monday, March 25, 2013 11:57 PM
> To: [EMAIL PROTECTED]
> Subject: Re: HBase Writes With Large Number of Columns
>
> I just ran some LoadTest to see if I can reproduce that.
>
> bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 4:512:100
> -num_keys 1000000
> 13/03/25 14:18:25 INFO util.MultiThreadedAction: [W:100] Keys=997172,
> cols=3,8m, time=00:03:55 Overall: [keys/s= 4242, latency=23 ms]
> Current: [keys/s=4413, latency=22 ms], insertedUpTo=-1
>
> bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 100:512:100
> -num_keys 1000000
>
> This one crashed because I don't have enough disk space, so I'm
> re-running it, but just before it crashed it was showing about 24.5x
> slower, which is coherent since it's writing 25x more columns.
>
> What size of data do you have? Big cells? Small cells? I will retry
> the test above with more lines and keep you posted.
>
> 2013/3/25 Pankaj Misra <[EMAIL PROTECTED]>:
> > Yes Ted, you are right, we have the table regions pre-split, and we
> > see that both regions are almost evenly filled in both tests.
> >
> > This does not seem to be a regression though, since we were getting good
> > write rates when we had fewer columns.
> >
> > Thanks and Regards
> > Pankaj Misra
> >
> >
> > ________________________________________
> > From: Ted Yu [[EMAIL PROTECTED]]
> > Sent: Monday, March 25, 2013 11:15 PM
> > To: [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: HBase Writes With Large Number of Columns
> >
> > Copying Ankit who raised the same question soon after Pankaj's initial
> > question.
> >
> > On one hand I wonder if this was a regression in 0.94.5 (though
> > unlikely).
> >
> > Did the region servers receive (relatively) the same write load for the
> > second test case? I assume you have pre-split your tables in both cases.
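For completeness, pre-splitting at table-creation time with the 0.94 client API looks roughly like the following sketch; the table name, family and split points are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal sketch of creating a pre-split table with the 0.94 client API,
// so writes are spread across region servers from the start.
// Table name, family and split points are hypothetical.
public class PreSplitTableSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor("test_table");  // hypothetical
        desc.addFamily(new HColumnDescriptor("cf"));

        // Two explicit split points give three initial regions.
        byte[][] splits = new byte[][] {
            Bytes.toBytes("row-3333333"),
            Bytes.toBytes("row-6666666")
        };
        admin.createTable(desc, splits);
        admin.close();
    }
}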
> >
> > Cheers
> >
> > On Mon, Mar 25, 2013 at 10:18 AM, Pankaj Misra
> > <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Ted,
> >>
> >> Sorry for missing that detail, we are using HBase version 0.94.5
> >>
> >> Regards
> >> Pankaj Misra
> >>
> >>
> >> ________________________________________
> >> From: Ted Yu [[EMAIL PROTECTED]]
> >> Sent: Monday, March 25, 2013 10:29 PM
> >> To: [EMAIL PROTECTED]
> >> Subject: Re: HBase Writes With Large Number of Columns
> >>
> >> If you give us the version of HBase you're using, that would give us
> >> some more information to help you.
> >>
> >> Cheers
> >>
> >> On Mon, Mar 25, 2013 at 9:55 AM, Pankaj Misra
> >> <[EMAIL PROTECTED]> wrote:
> >>
> >> > Hi,
> >> >
> >> > The issue that I am facing is a performance drop in HBase between
> >> > having 20 columns in a column family and now having 40 columns in a
> >> > column family. The number of columns has doubled and the
> >> > ingestion/write speed has dropped by half. I am writing 1.5 KB of
> >> > record data distributed across these columns.