HBase >> mail # user >> Performance tuning


Kristoffer Sjögren 2013-12-21, 19:17
lars hofhansl 2013-12-21, 20:44
Kristoffer Sjögren 2013-12-21, 21:28
Re: Performance tuning
Thanks Kristoffer,

yeah, that's the right metric. I would put my bet on the slower network.
But you're also doing a select count(*) query in Phoenix, right? So nothing should really be sent across the network.

When you do the queries, can you check whether there is any network traffic?

-- Lars
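
One way to check for network traffic on a Linux RegionServer host is to sample the interface byte counters before and after running the query. The following is only a minimal sketch, assuming the standard Linux /proc/net/dev layout; the function names and the interface name used in the note below are illustrative assumptions, not something taken from this thread.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        iface, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[iface.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def read_counters(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())


def traffic_delta(before, after, iface):
    """Return (rx_bytes, tx_bytes) transferred on iface between two snapshots."""
    rx = after[iface][0] - before[iface][0]
    tx = after[iface][1] - before[iface][1]
    return rx, tx
```

To use it, call read_counters() once before the query and once after, then pass both snapshots to traffic_delta() with the host's interface name (for example "eth0", which is an assumption about the machine). The counters include all traffic on the interface, including HDFS and other background traffic, so a comparison against an idle interval of the same length is needed to interpret the numbers.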

________________________________
 From: Kristoffer Sjögren <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
Sent: Saturday, December 21, 2013 1:28 PM
Subject: Re: Performance tuning
 

@pradeep scanner caching should not be an issue since data transferred to
the client is tiny.

@lars Yes, the data might be small for this particular case :-)

I have checked everything I can think of on the RS (CPU, network, HBase
console, uptime etc.) and nothing stands out, except for the pings (network
pings).
There are 5 regions each on RS 7, 18, 19, and 23; the others have 4.
hdfsBlocksLocalityIndex=100 on all RS (was that the correct metric?)

-Kristoffer
On Sat, Dec 21, 2013 at 9:44 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> Hi Kristoffer,
> For this particular problem: are many regions on the same RegionServers?
> Did you profile those RegionServers? Anything weird on that box?
> Slower pings might well be an issue. How's the data locality? (You can
> check on a RegionServer's overview page).
> If needed, you can issue a major compaction to reestablish local data on
> all RegionServers.
>
>
> 32 cores matched with only 4G of RAM is a bit weird, but with your tiny
> dataset it doesn't matter anyway.
>
> 10m rows across 96 regions is just about 100k rows per region. You won't
> see many of the nice properties of HBase.
> Try with 100m (or better 1bn rows). Then we're talking. For anything below
> this you wouldn't want to use HBase anyway.
> (100k rows I could scan on my phone with a Perl script in less than 1s)
>
>
> By "ping" do you mean an actual network ping, or some operation on top of
> HBase?
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Kristoffer Sjögren <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Saturday, December 21, 2013 11:17 AM
> Subject: Performance tuning
>
>
> Hi
>
> I have been performance tuning HBase 0.94.6 running Phoenix 2.2.0 for the
> last couple of days and need some help.
>
> Background.
>
> - 23-machine cluster, 32 cores, 4GB heap per RS.
> - Table t_24 has 24 online regions (24 salt buckets).
> - Table t_96 has 96 online regions (96 salt buckets).
> - 10.5 million rows per table.
> - Count query - select count(*) from ...
> - Group by query - select A, B, C, sum(D) from ... where (A = 1 and T >= 0
> and T <= 2147482800) group by A, B, C;
>
> What I found ultimately is that region servers 19, 20, 21, 22 and 23
> are consistently 2-3x slower than the others. This hurts overall latency
> pretty badly since queries are executed in parallel on the RS and then
> aggregated at the client (through Phoenix). In Hannibal, regions are spread
> out evenly over region servers, according to salt buckets (a Phoenix
> feature: pre-created regions and a rowkey prefix).
>
> As far as I can tell, there is no network or hardware configuration
> divergence between the machines. No CPU, network or other notable
> divergence in Ganglia. No RS metric differences in HBase master console.
>
> The only thing that may be of interest is that pings (within the cluster)
> to the bad RS are about 2-3x slower, around 0.130ms vs 0.050ms for the
> others. Not sure if this is significant, but I get a bad feeling about it
> since it matches exactly the RS that stood out in my performance tests.
>
> Any ideas of how I might find the source of this problem?
>
> Cheers,
> -Kristoffer
>
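
The reasoning in the original message (queries run in parallel on the region servers and are aggregated at the client, so the slowest server sets the overall latency) can be sketched roughly as follows. The timings are made-up placeholder values, not measurements from this cluster, and the function names and the 2x threshold are assumptions chosen for illustration.

```python
from statistics import median


def overall_latency(per_server_ms):
    """With a parallel fan-out, the client waits for the slowest server."""
    return max(per_server_ms.values())


def stragglers(per_server_ms, factor=2.0):
    """Return the servers whose latency is at least `factor` times the median."""
    mid = median(per_server_ms.values())
    return sorted(rs for rs, ms in per_server_ms.items() if ms >= factor * mid)


# Placeholder per-server scan times in milliseconds (hypothetical values).
timings = {"rs1": 100.0, "rs2": 110.0, "rs3": 95.0, "rs19": 250.0, "rs20": 290.0}

slow = stragglers(timings)          # servers at or above 2x the median
total = overall_latency(timings)    # dominated by the slowest server
```

This ignores the client-side aggregation cost, so it only approximates the end-to-end query time, but it shows why a few servers that are 2-3x slower can dominate the latency of the whole query.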