HBase >> mail # user >> HBase Writes With Large Number of Columns


RE: HBase Writes With Large Number of Columns
First, thanks a lot, Jean and Ted, for your extended help; I very much appreciate it.

Yes, Ted, I am writing to all 40 columns, and the 1.5 KB of record data is distributed across these columns.

Jean, some columns store as little as a single byte, while a few columns store as much as 80-125 bytes of data. The overall record size is 1.5 KB. These records are written using batch mutation with the Thrift API, with 100 records per batch mutation.
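
A rough sketch of why column count can matter even when the record size stays fixed: HBase stores every cell as its own KeyValue, and each KeyValue repeats the row key, column family, qualifier, timestamp, and type. The row key, family, and qualifier lengths below are hypothetical placeholders, not values from this thread, and the 20-column case is assumed to carry the same 1.5 KB payload, which the thread does not confirm.

```python
# Fixed framing bytes in HBase's KeyValue layout:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1) = 20 bytes per cell.
KV_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def row_bytes(payload_bytes, num_columns, row_key_len, family_len, qualifier_len):
    """Approximate serialized size of one row stored as num_columns cells."""
    per_cell_key = KV_FIXED_OVERHEAD + row_key_len + family_len + qualifier_len
    return payload_bytes + num_columns * per_cell_key


def batch_summary(num_columns, rows_per_batch=100, payload_bytes=1536,
                  row_key_len=16, family_len=1, qualifier_len=8):
    """Cells and approximate bytes for one batch mutation.

    row_key_len, family_len, and qualifier_len are assumed values
    chosen only for illustration.
    """
    per_row = row_bytes(payload_bytes, num_columns,
                        row_key_len, family_len, qualifier_len)
    return {
        "cells": rows_per_batch * num_columns,
        "bytes": rows_per_batch * per_row,
        "overhead_ratio": per_row / payload_bytes,
    }


if __name__ == "__main__":
    for cols in (20, 40):
        print(cols, batch_summary(cols))
```

Under these assumed lengths, doubling the column count doubles the number of cells per batch while the payload stays the same, so the per-cell key overhead grows from a fraction of the payload to more than the payload itself.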

Thanks and Regards
Pankaj Misra
________________________________________
From: Jean-Marc Spaggiari [[EMAIL PROTECTED]]
Sent: Monday, March 25, 2013 11:57 PM
To: [EMAIL PROTECTED]
Subject: Re: HBase Writes With Large Number of Columns

I just ran LoadTestTool to see if I can reproduce that.

bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 4:512:100 -num_keys 1000000
13/03/25 14:18:25 INFO util.MultiThreadedAction: [W:100] Keys=997172,
cols=3,8m, time=00:03:55 Overall: [keys/s= 4242, latency=23 ms]
Current: [keys/s=4413, latency=22 ms], insertedUpTo=-1

bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 100:512:100 -num_keys 1000000

This one crashed because I don't have enough disk space, so I'm
re-running it, but just before it crashed it was showing about 24.5
times slower, which is consistent since it's writing 25 times more columns.
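
The comparison above can be checked with a little arithmetic. In `-write 4:512:100` the fields are the average columns per key, the average data size per column, and the number of writer threads, so the two runs differ only in column count. A minimal sketch using the figures quoted in this message; the 24.5x slowdown is the approximate value seen before the crash, not a final measurement:

```python
def cells_per_second(keys_per_second, avg_cols_per_key):
    """Convert a row (key) write rate into a cell write rate."""
    return keys_per_second * avg_cols_per_key


# Overall rate reported by the 4-column run.
narrow_keys_per_s = 4242
narrow_cols = 4

# The 100-column run was observed at roughly 24.5 times slower in keys/s.
observed_slowdown = 24.5
wide_cols = 100
wide_keys_per_s = narrow_keys_per_s / observed_slowdown

narrow_cells = cells_per_second(narrow_keys_per_s, narrow_cols)
wide_cells = cells_per_second(wide_keys_per_s, wide_cols)

if __name__ == "__main__":
    print(f"4-column run:   {narrow_cells:.0f} cells/s")
    print(f"100-column run: {wide_cells:.0f} cells/s")
    print(f"ratio: {wide_cells / narrow_cells:.3f}")
```

With these numbers the cell rate comes out nearly the same in both runs, which is what one would expect if write cost scales with the number of cells rather than the number of rows. Each row in the wide run also carries 25 times more data, so this check alone does not separate per-cell cost from per-byte cost.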

What size of data do you have? Big cells? Small cells? I will retry
the test above with more rows and keep you posted.

2013/3/25 Pankaj Misra <[EMAIL PROTECTED]>:
> Yes Ted, you are right: our table regions are pre-split, and we see that both regions are almost evenly filled in both tests.
>
> This does not seem to be a regression, though, since we were getting good write rates when we had fewer columns.
>
> Thanks and Regards
> Pankaj Misra
>
>
> ________________________________________
> From: Ted Yu [[EMAIL PROTECTED]]
> Sent: Monday, March 25, 2013 11:15 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: HBase Writes With Large Number of Columns
>
> Copying Ankit who raised the same question soon after Pankaj's initial
> question.
>
> On one hand I wonder if this was a regression in 0.94.5 (though unlikely).
>
> Did the region servers receive (relatively) the same write load for the second
> test case? I assume you have pre-split your tables in both cases.
>
> Cheers
>
> On Mon, Mar 25, 2013 at 10:18 AM, Pankaj Misra <[EMAIL PROTECTED]> wrote:
>
>> Hi Ted,
>>
>> Sorry for missing that detail; we are using HBase version 0.94.5.
>>
>> Regards
>> Pankaj Misra
>>
>>
>> ________________________________________
>> From: Ted Yu [[EMAIL PROTECTED]]
>> Sent: Monday, March 25, 2013 10:29 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: HBase Writes With Large Number of Columns
>>
>> If you give us the version of HBase you're using, that would give us some
>> more information to help you.
>>
>> Cheers
>>
>> On Mon, Mar 25, 2013 at 9:55 AM, Pankaj Misra <[EMAIL PROTECTED]> wrote:
>>
>> > Hi,
>> >
>> > The issue I am facing is a drop in HBase performance between when I had
>> > 20 columns in a column family and now, when I have 40 columns in a
>> > column family. The number of columns has doubled and the ingestion/write
>> > speed has dropped by half. I am writing 1.5 KB of data per row across
>> > 40 columns.
>> >
>> > Are there any settings that I should look into for tuning HBase to
>> > write a higher number of columns faster?
>> >
>> > I would request the community's help in letting me know how I can write
>> > to a column family with a large number of columns efficiently.
>> >
>> > I would greatly appreciate any help or clues around this issue.
>> >
>> > Thanks and Regards
>> > Pankaj Misra
>> >

________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.