HBase user mailing list, thread: Re: 0.92 and Read/writes not scaling
Re: 0.92 and Read/writes not scaling
To close the loop on this thread, we were able to track down the
issue. See https://issues.apache.org/jira/browse/HDFS-3280, which was
just committed in HDFS.

It's a simple patch if you want to patch your own build. Otherwise
this should show up in CDH4 nightly builds tonight, and I think in
CDH4b2 as well.

If you want to patch on the HBase side, you can edit HLog.java to
remove the checks for the "sync" method, and have it only call
"hflush". It's only the compatibility path that caused the problem.
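As a rough illustration only (not the actual HLog.java source), the "compatibility path" being removed works along these lines: the WAL writer probes the output stream by reflection, preferring the newer hflush() but falling back to the older sync() when hflush() is absent. The class and stream names below are hypothetical stand-ins; the suggested patch amounts to dropping the sync() branch and always calling hflush().

```java
import java.lang.reflect.Method;

// Illustrative sketch of a reflection-based compatibility probe.
// OldStream/NewStream are hypothetical stand-ins for pre- and
// post-hflush Hadoop output streams, not real HDFS classes.
public class WalSyncProbe {

    public static class OldStream {
        public void sync() { /* older durability call */ }
    }

    public static class NewStream {
        public void hflush() { /* newer durability call */ }
    }

    /** Prefer hflush() if the stream has it; otherwise fall back to sync(). */
    public static Method pickFlushMethod(Object stream) {
        for (String name : new String[] { "hflush", "sync" }) {
            try {
                return stream.getClass().getMethod(name);
            } catch (NoSuchMethodException e) {
                // this stream generation lacks the method; keep probing
            }
        }
        throw new IllegalStateException("stream has neither hflush nor sync");
    }

    public static void main(String[] args) {
        // New streams resolve to hflush; old ones fall back to sync.
        System.out.println(pickFlushMethod(new NewStream()).getName()); // hflush
        System.out.println(pickFlushMethod(new OldStream()).getName()); // sync
    }
}
```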

Thanks
-Todd

On Wed, Apr 4, 2012 at 8:02 PM, Juhani Connolly
<[EMAIL PROTECTED]> wrote:
> done, thanks for pointing me to that
>
>
> On 04/05/2012 11:43 AM, Ted Yu wrote:
>>
>> Juhani:
>> Thanks for sharing your results.
>>
>> Do you mind putting the summary on HBASE-5699: Run with > 1 WAL in
>> HRegionServer?
>>
>> On Wed, Apr 4, 2012 at 6:45 PM, Juhani Connolly <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Another quick update: since moving back to HDFS 0.20.2 (with HBase
>>> still at 0.92), we found that while we made significant gains in
>>> throughput, most of our regionservers' IPC threads were stuck
>>> somewhere in HWal.append (out of 50, 42 were in append, of which 20
>>> were in sync), limiting throughput despite significant free hardware
>>> resources.
>>>
>>> Because the WAL writes of a single RS all go sequentially to one HDFS
>>> file, we assumed we could improve throughput by spreading the writes
>>> across more WAL files and more disks. To do this we ran multiple
>>> region servers on each node.
>>>
>>> The scaling wasn't linear (we were in no way increasing hardware,
>>> just the number of regionservers), but we are now getting
>>> significantly more throughput.
>>> I would personally not call this a great approach to have to take;
>>> it would generally be better to build more, smaller servers, which
>>> would then not limit themselves by pushing a lot of data per server
>>> through a single WAL file.
>>>
>>> Of course there may be another solution to this that I'm not aware of? If
>>> so I'd love to hear it.
>>>
>

--
Todd Lipcon
Software Engineer, Cloudera
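The multi-WAL direction Ted references (HBASE-5699) can be sketched roughly as follows: instead of funnelling every append through one log's lock, spread appends round-robin across several writers. This is a hypothetical illustration of the idea; MultiWal and Wal are made-up names, not HBase APIs.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of striping WAL appends across multiple logs so that
// concurrent appenders rarely contend on the same writer lock.
public class MultiWal {

    /** Minimal stand-in for a single write-ahead log. */
    public static class Wal {
        long entries = 0;
        // One lock per log: contention is divided across the array below.
        synchronized void append(byte[] edit) { entries++; }
    }

    private final Wal[] wals;
    private final AtomicLong next = new AtomicLong();

    public MultiWal(int n) {
        wals = new Wal[n];
        for (int i = 0; i < n; i++) wals[i] = new Wal();
    }

    /** Pick a log round-robin for each edit. */
    public void append(byte[] edit) {
        wals[(int) (next.getAndIncrement() % wals.length)].append(edit);
    }

    /** Total entries across all logs (used to check nothing was dropped). */
    public long total() {
        long t = 0;
        for (Wal w : wals) t += w.entries;
        return t;
    }

    public static void main(String[] args) {
        MultiWal m = new MultiWal(4);
        for (int i = 0; i < 100; i++) m.append(new byte[0]);
        System.out.println(m.total()); // 100
    }
}
```

Note that a real implementation would also have to preserve per-region ordering and handle log rolling and recovery, which is why running extra region servers per node was the pragmatic workaround here.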