Accumulo >> mail # user >> BatchWriter performance on 1.4


Re: BatchWriter performance on 1.4
On Thu, Sep 19, 2013 at 5:08 PM, Slater, David M. <[EMAIL PROTECTED]> wrote:

> Thanks Keith, I’m looking at it now. It appears like what I would want. As
> for the proper usage…
>
> Would I create one using the Connector,
> then .getBatchWriter() for each of the tables I’m interested in,
> add data to each of BatchWriters returned,
>

yes.
> and then hit flush() when I want all of that to get written?
>

Why are you calling flush()? Doing this frequently will increase RPC
overhead and lower throughput.
> Would the individual batch writers spawned by the multiTableBatchWriter
> still have synchronized addMutations() methods so I would have to worry
> about blocking still, or would that all happen at the flush() method?
>

The returned batch writers are thread safe. They all add to the same
queue/buffer in a synchronized manner. Calling flush() on any of the
batch writers returned from getBatchWriter() will block the others.
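
To make the shared-buffer point concrete, here is a toy model in plain Java. It
is not Accumulo code: MultiWriter and TableWriter are hypothetical stand-ins,
and the only thing the sketch assumes is what is stated above, namely that all
per-table writers add to one synchronized buffer and that a flush on any of
them drains that shared buffer while the others wait on the same lock.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedBufferSketch {

    /** Hypothetical stand-in for the multi-table writer: one buffer for all tables. */
    public static class MultiWriter {
        private final List<String> buffer = new ArrayList<>();
        private int sent = 0;        // mutations drained from the buffer so far
        private int flushCalls = 0;  // each flush stands for at least one round trip

        /** Every handle returned here writes into the same shared buffer. */
        public TableWriter getWriter(String table) {
            return new TableWriter(this, table);
        }

        synchronized void add(String table, String mutation) {
            buffer.add(table + ":" + mutation);
        }

        /** Drains the whole shared buffer; add() on any handle waits meanwhile. */
        synchronized void flush() {
            sent += buffer.size();
            buffer.clear();
            flushCalls++;
        }

        public synchronized int buffered() { return buffer.size(); }
        public synchronized int sent() { return sent; }
        public synchronized int flushCalls() { return flushCalls; }
    }

    /** Hypothetical per-table handle; it holds no buffer of its own. */
    public static class TableWriter {
        private final MultiWriter parent;
        private final String table;

        TableWriter(MultiWriter parent, String table) {
            this.parent = parent;
            this.table = table;
        }

        public void addMutation(String mutation) { parent.add(table, mutation); }
        public void flush() { parent.flush(); }
    }

    public static void main(String[] args) {
        MultiWriter mw = new MultiWriter();
        TableWriter a = mw.getWriter("tableA");
        TableWriter b = mw.getWriter("tableB");
        a.addMutation("row1");
        b.addMutation("row2");
        a.flush(); // also drains what was added through b
        System.out.println("buffered=" + mw.buffered() + " sent=" + mw.sent());
    }
}
```

In this model, flushing after every mutation would make flushCalls equal to the
number of mutations, which is the per-call overhead the advice above is meant
to avoid.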

If you set the log4j log level to TRACE for
org.apache.accumulo.core.client.impl you can see output like the following.
Binning is the process of taking each mutation and deciding which tablet
and tablet server it goes to.

  2013-09-19 18:43:37,261 [impl.ThriftTransportPool] TRACE: Using existing connection to 127.0.0.1:9997
  2013-09-19 18:43:37,393 [impl.TabletLocatorImpl] TRACE: tid=12 oid=13  Binning 80909 mutations for table 3
  2013-09-19 18:43:37,402 [impl.TabletLocatorImpl] TRACE: tid=12 oid=13  Binned 80909 mutations for table 3 to 1 tservers in 0.009 secs
  2013-09-19 18:43:37,402 [impl.TabletServerBatchWriter] TRACE: Started sending 80,909 mutations to 1 tablet servers
  2013-09-19 18:43:37,656 [impl.ThriftTransportPool] TRACE: Returned connection 127.0.0.1:9997 (120000) ioCount : 1459116
  2013-09-19 18:43:37,657 [impl.TabletServerBatchWriter] TRACE: sent 80,909 mutations to 127.0.0.1:9997 in 0.40 secs (204,832.91 mutations/sec) with 0 failures
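
Binning, as described above, can be pictured with a small sorted-map lookup.
This is only a sketch under stated assumptions, not the TabletLocatorImpl
logic: it assumes each tablet covers the rows up to and including its end row,
it groups by tablet server only (the real client also tracks the tablet), it
compares rows as Java strings, and the class name, split points, and server
addresses are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BinningSketch {
    // Tablet end row -> tablet server. A tablet holds rows in (previous end row, end row].
    private final TreeMap<String, String> tabletEndRows = new TreeMap<>();
    // Server hosting the last tablet, which has no end row.
    private final String lastTabletServer;

    public BinningSketch(Map<String, String> endRowToServer, String lastTabletServer) {
        this.tabletEndRows.putAll(endRowToServer);
        this.lastTabletServer = lastTabletServer;
    }

    /** Finds the server for one row: the first tablet whose end row is >= the row. */
    public String locate(String row) {
        Map.Entry<String, String> tablet = tabletEndRows.ceilingEntry(row);
        return tablet != null ? tablet.getValue() : lastTabletServer;
    }

    /** Groups the rows of a batch of mutations by the server that should receive them. */
    public Map<String, List<String>> bin(List<String> rows) {
        Map<String, List<String>> binned = new TreeMap<>();
        for (String row : rows) {
            binned.computeIfAbsent(locate(row), server -> new ArrayList<>()).add(row);
        }
        return binned;
    }

    public static void main(String[] args) {
        Map<String, String> splits = new TreeMap<>();
        splits.put("g", "ts1:9997"); // hypothetical split points and servers
        splits.put("p", "ts2:9997");
        BinningSketch locator = new BinningSketch(splits, "ts3:9997");

        List<String> rows = List.of("apple", "grape", "zebra", "g");
        System.out.println(locator.bin(rows));
    }
}
```

With a single tablet on a single server, as in the trace above, every mutation
lands in the same bin, which is why binning 80909 mutations takes only a few
milliseconds there.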

When you close the batch writer, it will log some summary stats like the
following.
  2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE:
  2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: TABLET SERVER BATCH WRITER STATISTICS
  2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Added                :  1,000,000 mutations
  2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Sent                 :  1,000,000 mutations
  2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Resent percentage   :       0.00%
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Overall time         :       5.94 secs
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Overall send rate    : 168,406.87 mutations/sec
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Send efficiency      :      86.91%
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE:
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: BACKGROUND WRITER PROCESS STATISTICS
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Total send time      :       5.16 secs  86.91%
  2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Average send rate    : 193,760.90 mutations/sec
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: Total bin time       :       0.46 secs   7.81%
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: Average bin rate     : 2,155,172.41 mutations/sec
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: tservers per batch   :     1.00 avg       1 min      1 max
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: tablets per batch    :     1.00 avg       1 min      1 max
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE:
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: SYSTEM STATISTICS
  2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: JVM GC Time          :       0.53 secs
  2013-09-19 18:43:39,152 [impl.TabletServerBatchWriter] TRACE: JVM Compile Time     :       1.60 secs
  2013-09-19 18:43:39,152 [impl.TabletServerBatchWriter] TRACE: System load average : initial=  0.22 final=  0.20
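
As a rough guide to reading these statistics, the rates and percentages appear
to be simple ratios of the counts and times shown. The sketch below recomputes
them from the rounded values in the log. The relationships are inferred from
the numbers themselves, not taken from the writer's source, and the class and
method names are hypothetical.

```java
public class WriterStatsSketch {

    /** Mutations per second; returns 0 when no time has elapsed. */
    public static double rate(long mutations, double seconds) {
        return seconds > 0 ? mutations / seconds : 0.0;
    }

    /** Share of the overall time spent in one phase, as a percentage. */
    public static double percentOf(double phaseSeconds, double overallSeconds) {
        return overallSeconds > 0 ? 100.0 * phaseSeconds / overallSeconds : 0.0;
    }

    public static void main(String[] args) {
        // Values copied from the summary above (already rounded in the log).
        long mutations = 1_000_000L;
        double overallSecs = 5.94;
        double sendSecs = 5.16;
        double binSecs = 0.46;

        System.out.printf("Overall send rate : %,.2f mutations/sec%n", rate(mutations, overallSecs));
        System.out.printf("Average send rate : %,.2f mutations/sec%n", rate(mutations, sendSecs));
        System.out.printf("Average bin rate  : %,.2f mutations/sec%n", rate(mutations, binSecs));
        System.out.printf("Send efficiency   : %.2f%%%n", percentOf(sendSecs, overallSecs));
        System.out.printf("Bin time share    : %.2f%%%n", percentOf(binSecs, overallSecs));
    }
}
```

The results land close to the logged figures but not exactly on them (for
example about 168,350 mutations/sec and 86.87%), presumably because the log
computes from unrounded timings. A low send efficiency or a large bin time
share would suggest the time is going somewhere other than the network sends.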

What do these numbers look like for you?

Keith
