HBase, mail # user - Save the bandwidth usage

Re: Reply: Save the bandwidth usage
Jia Wang 2013-11-14, 10:15
Yes, SNAPPY compression is already enabled, but I don't think it helps
much because we are generating random characters.
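
To illustrate why compression does little for random payloads, here is a
small sketch. It uses Python's standard zlib module as a stand-in for SNAPPY
(an assumption, since SNAPPY bindings are not in the standard library); the
name compression_ratio and the sample payloads are illustrative only.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Uniformly random bytes: close to incompressible.
random_payload = bytes(rng.getrandbits(8) for _ in range(100_000))
# Highly repetitive bytes: compress very well.
repetitive_payload = b"row-value-" * 10_000

print(f"random:     {compression_ratio(random_payload):.3f}")
print(f"repetitive: {compression_ratio(repetitive_payload):.3f}")
```

Random characters drawn from a small alphabet would compress somewhat better
than fully random bytes, so the real test data may sit between the two cases.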

The replication factor is 3 by default in Hadoop. We have a 4-server
cluster, and three of the servers also run a RegionServer. I have disabled
the auto-split policy for my table. Currently the table has 16 regions,
with row keys starting from "0~9" and "A~F", and requests are spread
quite evenly across them.
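
A layout like this could come from 15 split points on the first character of
the row key. The sketch below is only an illustration under that assumption;
split_points and region_index are hypothetical helper names, not HBase APIs.

```python
import bisect
from typing import List

HEX_PREFIXES = "0123456789ABCDEF"


def split_points() -> List[str]:
    """15 split keys give 16 regions: [, "1"), ["1", "2"), ..., ["F", )."""
    return list(HEX_PREFIXES[1:])


def region_index(row_key: str, splits: List[str]) -> int:
    """Index of the region whose [start, end) key range contains row_key."""
    # A region's start key is inclusive, so a key equal to a split point
    # belongs to the region on the right-hand side of that split.
    return bisect.bisect_right(splits, row_key)


splits = split_points()
for key in ("0abc", "9zzz", "A123", "Fxyz"):
    print(key, "-> region", region_index(key, splits))
```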

Disk I/O and CPU usage are fine, no more than 50%. Memory is also fine,
because the test only inserts data (300M). The heap size is set to
4096MB for each of my RegionServers.

Any ideas? My thought is that the network bandwidth is consumed by data
replication and other regular HBase/Hadoop sync operations. If what I
have is a normal case, then fine, I will finish the tuning.
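
A rough back-of-the-envelope model of that replication traffic might look like
the sketch below. It assumes each RegionServer writes the first HDFS replica
locally and ships the remaining replicas over the network, and that every
ingested byte is written once to the WAL (uncompressed), once on flush, and
again on each compaction rewrite. The function and its parameters are
illustrative, not measured values from this cluster.

```python
def network_bytes_per_ingested_byte(
    replication: int = 3,
    compression_ratio: float = 1.0,
    compaction_rewrites: float = 1.0,
    client_is_remote: bool = True,
) -> float:
    """Estimate bytes on the wire for each byte a client inserts."""
    # Replicas beyond the first (local) one travel over the network.
    remote_replicas = max(replication - 1, 0)
    client_hop = 1.0 if client_is_remote else 0.0
    wal = remote_replicas * 1.0  # WAL assumed uncompressed
    flush = remote_replicas * compression_ratio
    compaction = remote_replicas * compression_ratio * compaction_rewrites
    return client_hop + wal + flush + compaction


print(network_bytes_per_ingested_byte())
```

With the defaults (replication 3, incompressible data, one compaction rewrite)
the model gives 7 bytes on the wire per ingested byte, so under these
assumptions a modest insert rate could plausibly fill a 1Gb link.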

On Thu, Nov 14, 2013 at 5:08 PM, 谢良 <[EMAIL PROTECTED]> wrote:

> Have you enabled compression? Could you show us more metrics about
> data locality?
> Are there perhaps lots of compactions running?
> Could you also tell us the read/write count per second and the estimated KV size?
> ...
> It needs more detail, I think :)
> Thanks,
> Liang
> ________________________________________
> From: Jia Wang [[EMAIL PROTECTED]]
> Sent: 2013-11-14 16:56
> Subject: Save the bandwidth usage
> Hi Folks
> We are tuning an HBase cluster. During a performance test, the current
> bottleneck seems to be network bandwidth: the bidirectional
> bandwidth usage (sending + receiving) between our nodes is around 1Gb and
> almost hits the limit (we ran a pure network test beforehand). Any ideas
> on how we can improve this?
> Thanks
> Ramon