Re: Increasing Ingest Rate
Eric Newton 2013-04-08, 13:31
Hopefully you are using Accumulo 1.4.*3*.
A performance issue (ACCUMULO-1062) was found in 1.4.2 when a large number
of clients attempted to update a tablet concurrently.
On Thu, Apr 4, 2013 at 3:26 PM, Jimmy Lin <[EMAIL PROTECTED]> wrote:
> On Thu, Apr 4, 2013 at 2:25 PM, Eric Newton <[EMAIL PROTECTED]> wrote:
>> Have you pre-split your tablet to spread the load out to all the
>> Yes. We are using splits from loading the whole dataset previously.
>> Does the data distribution match your splits?
>> Yes. See above.
>> Is the ingest data already sorted (that is, it always writes to the last
>> No. The data writes to multiple tablets concurrently. We set up a queue
>> parameter and divide the data into multiple queues.
>> How much memory and how many threads are you using in your batchwriters?
>> I believe we have 16GB of memory for the Java writer with 18 threads
>> running per server.
>> Check the ingest rates on tablet server monitor page and look for hot
>> There are certain servers that have higher ingest rates, and the server
>> that is busiest changes over time, but the overall ingestion rate will not
>> go up.
>> On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin <[EMAIL PROTECTED]> wrote:
>>> I am fairly new to Accumulo and am trying to figure out what is
>>> preventing my system from ingesting data at a faster rate. We have 15 nodes
>>> running a simple Java program that reads and writes to Accumulo and then
>>> indexes some data into Solr. The rate of ingest is not scaling linearly
>>> with the number of nodes that we start up. I have tried increasing several
>>> parameters including:
>>> - limit of file descriptors in Linux
>>> - max ZooKeeper connections
>>> - tserver.memory.maps.max
>>> - tserver_opts memory size
>>> - tserver.mutation_queue.max
>>> - tserver.scan.files.open.max
>>> - tserver.walog.max.size
>>> - tserver.cache.data.size
>>> - tserver.cache.index.size
>>> - HDFS setting for xceivers
>>> No matter what changes we make, we cannot get the ingest rate to go over
>>> 100k entries/s and about 6 Mb/s. I know Accumulo should be able to ingest
>>> faster than this.
>>> Thanks in advance,
>>> Jimmy Lin
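
For a sense of scale, the figures in the original message (15 nodes, about 100k entries/s in total) can be turned into per-node and per-entry numbers. The sketch below is only back-of-the-envelope arithmetic, not a measurement. It assumes that "6 Mb/s" means 6 * 10^6 bytes per second, which the message does not confirm, and the class and method names are hypothetical.

```java
public class IngestRateEstimate {
    // Average entries per second handled by each ingest node.
    public static double entriesPerNode(double totalEntriesPerSec, int nodes) {
        return totalEntriesPerSec / nodes;
    }

    // Average size of one entry in bytes, given aggregate throughput.
    public static double bytesPerEntry(double bytesPerSec, double entriesPerSec) {
        return bytesPerSec / entriesPerSec;
    }

    public static void main(String[] args) {
        int nodes = 15;                  // from the original message
        double entriesPerSec = 100_000;  // from the original message
        // ASSUMPTION: "6 Mb/s" is read as 6 * 10^6 bytes per second.
        double bytesPerSec = 6_000_000;

        System.out.printf("entries/s per node: %.0f%n",
                entriesPerNode(entriesPerSec, nodes));
        System.out.printf("bytes per entry: %.0f%n",
                bytesPerEntry(bytesPerSec, entriesPerSec));
    }
}
```

Under that assumption, each node contributes fewer than 7,000 entries per second and the average entry is on the order of tens of bytes. That is consistent with the bottleneck being somewhere other than raw data volume.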