HBase >> mail # dev >> About "dfs.client-write-packet-size" setting


About "dfs.client-write-packet-size" setting
The default dfs.client-write-packet-size value is 64k, at least in my Hadoop 2 environment.
I benchmarked it with YCSB, loading 2 million records (3*200 bytes):
1) dfs.client-write-packet-size=64k: YGC count 399, YGCT 4.208s
2) dfs.client-write-packet-size=8k: YGC count 163, YGCT 2.644s
As you can see, that's about a 37% reduction in young GC time :)
The reason: in the DFSOutputStream.Packet class, every "Create a new packet" operation
calls "buf = new byte[PacketHeader.PKT_MAX_HEADER_LEN + pktSize];",
where "pktSize" comes from the dfs.client-write-packet-size setting. In the HBase write path
we sync the WAL as soon as possible, so all the new packets are very small
(in my YCSB test, most were only hundreds of bytes, or a few kilobytes)
and rarely reach 64k, so always allocating a 64k array is just a waste.
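To make the waste concrete, here is a minimal sketch (not the actual Hadoop source) of the allocation pattern described above: each new packet allocates header room plus the full configured packet size, no matter how few payload bytes it will actually carry. PKT_MAX_HEADER_LEN is assumed to be 33 here just for illustration; the real constant is computed inside Hadoop.

```java
public class PacketAllocationSketch {
    // Stand-in for PacketHeader.PKT_MAX_HEADER_LEN (assumed value, for illustration).
    static final int PKT_MAX_HEADER_LEN = 33;

    // Each "Create a new packet" allocates header room plus the full
    // configured dfs.client-write-packet-size, regardless of payload size.
    static byte[] newPacketBuffer(int pktSize) {
        return new byte[PKT_MAX_HEADER_LEN + pktSize];
    }

    public static void main(String[] args) {
        int payload = 600; // typical small WAL-sync payload from the YCSB run
        byte[] big = newPacketBuffer(64 * 1024);
        byte[] small = newPacketBuffer(8 * 1024);
        System.out.println("64k setting allocates " + big.length
                + " bytes for a " + payload + "-byte payload");
        System.out.println("8k setting allocates " + small.length
                + " bytes for the same payload");
    }
}
```

With frequent WAL syncs, these short-lived 64k arrays pile up in the young generation, which is why shrinking the packet size cut young GC counts and time in the benchmark above.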
It would be good to add a note about this to the ref guide :)

PS: 8k is just a test setting; it should be chosen according to the real KV size pattern.
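For anyone who wants to try this, a sketch of the override in the client-side config (the property name and byte-valued default are standard HDFS; the 8192 value is just the test setting from this thread, to be tuned to your own KV sizes):

```xml
<!-- hdfs-site.xml (or the HBase client's site config) -->
<property>
  <name>dfs.client-write-packet-size</name>
  <value>8192</value> <!-- bytes; default is 65536 -->
</property>
```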

Thanks,
