My "dream" is to get to your point; I was always stopped before reaching the
network limit. My premise was that WAL sync was the key bottleneck.
How much data are you inserting? How many client threads? What batch size?
Share some more info on your cluster and test setup.
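Batch size matters because it amortizes the fixed per-RPC cost over many rows. A minimal sketch of that effect, with illustrative numbers (the KV size and per-RPC overhead below are assumptions, not HBase measurements):

```python
# Toy model of RPC amortization for batched puts.
# KV_SIZE and RPC_OVERHEAD are illustrative assumptions, not measured values.
KV_SIZE = 200        # assumed average KeyValue payload, in bytes
RPC_OVERHEAD = 100   # assumed fixed per-RPC framing/metadata cost, in bytes

def bytes_per_row(batch_size):
    """Effective wire bytes per row when puts are sent in batches."""
    return KV_SIZE + RPC_OVERHEAD / batch_size

for batch in (1, 10, 100, 1000):
    print(batch, round(bytes_per_row(batch), 1))
# 1 -> 300.0, 10 -> 210.0, 100 -> 201.0, 1000 -> 200.1
```

In this toy model, going from single puts to batches of 100 removes most of the per-request overhead; beyond that, returns diminish.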
On Thursday, November 14, 2013, Jia Wang wrote:
> Yes, SNAPPY compression has already been enabled, though I don't think it
> helps much, since we are generating random characters.
> The replication factor is 3 by default in Hadoop. We have a 4-server
> cluster, with three of the nodes running a RegionServer. I have disabled
> the auto-split policy for my table; currently there are 16 regions for the
> table, with row keys starting from "0~9" and "A~F", and they receive
> requests quite evenly.
> Disk I/O and CPU usage are fine, no more than 50%, and memory is also OK,
> since the test only inserts data (300M). The heap size is set to 4096 MB
> for each of my RegionServers.
> Any ideas? My thought is that the network bandwidth is consumed by data
> replication and other regular HBase/Hadoop sync operations. If what I am
> seeing is a normal case, then fine, I will finish the tuning.
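A back-of-envelope model of where that bandwidth could be going, assuming the "300M" figure means 300 MB of payload, an HDFS replication factor of 3, and RegionServers co-located with DataNodes so the first replica of each HDFS write is local (all of these are assumptions, not measurements):

```python
# Rough estimate of cross-node traffic generated by a 300 MB insert.
# Assumes replication factor 3 with the first replica written locally,
# so each HDFS write crosses the network twice (two remote pipeline hops).
DATA_MB = 300
REPLICATION = 3
REMOTE_COPIES = REPLICATION - 1   # pipeline hops that cross the network

wal_traffic = DATA_MB * REMOTE_COPIES    # every put is first logged to the WAL
flush_traffic = DATA_MB * REMOTE_COPIES  # memstore flushes rewrite the data as HFiles
total = wal_traffic + flush_traffic      # compactions would add more on top
print(total)  # 1200 MB moved across the network before any compaction
```

Under these assumptions, roughly 4x the client payload crosses the network even before compactions rewrite the HFiles, which would fit the observation that replication is eating the bandwidth.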
> > Have you enabled compression? Could you show us more metrics about
> > data locality?
> > Maybe there are lots of compaction activities running?
> > Could you also tell us the read/write count per second and the estimated
> > KV size?
> > ...
> > We need more detailed info, I think :)
> > Thanks,
> > Liang
> > ________________________________________
> > Sent: November 14, 2013, 16:56
> > 主题: Save the bandwidth usage
> > Hi Folks
> > We are tuning an HBase cluster, and it seems the current limitation is
> > network bandwidth. During a performance test, the bidirectional bandwidth
> > usage (sending + receiving) between our nodes is around 1 Gb and almost
> > hits the limit (we ran a pure network test beforehand), so any
> > suggestions on how we can improve this?
> > Thanks
> > Ramon
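For context on that 1 Gb ceiling: assuming replication-driven write amplification of roughly 4x (WAL plus memstore flush, two remote copies each, with the first replica local), a quick calculation of the client ingest rate that would saturate one such link (illustrative numbers only):

```python
# How much client payload per second saturates a 1 Gb/s link,
# assuming ~4x cross-node write amplification (an assumption, not a measurement)?
LINK_MB_PER_S = 1000 / 8   # 1 Gb/s is roughly 125 MB/s of payload
AMPLIFICATION = 4          # assumed: WAL + flush, two remote copies each

max_ingest = LINK_MB_PER_S / AMPLIFICATION
print(round(max_ingest, 2))  # 31.25 MB/s of client payload per saturated link
```

If sustained ingest per node is anywhere near this figure, hitting the network limit is the expected outcome rather than a misconfiguration.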