Jimmy Lin 2013-04-04, 18:01
Re: Increasing Ingest Rate
Have you pre-split your table to spread the load out across all the machines?
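
As a rough illustration of pre-splitting, the sketch below computes evenly spaced split points. It assumes (this is not stated in the thread) that row keys start with a uniformly distributed two-digit lowercase hex prefix, for example from hashing; the class and method names are made up for the example. The resulting set could be converted to Text objects and handed to Accumulo's TableOperations.addSplits.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class SplitPoints {
    // Returns numTablets - 1 split points that divide the two-hex-digit
    // prefix space 00..ff into numTablets roughly equal ranges.
    // Assumption: row keys begin with a uniformly distributed hex prefix.
    public static SortedSet<String> evenSplits(int numTablets) {
        if (numTablets < 1 || numTablets > 256) {
            throw new IllegalArgumentException("numTablets must be in 1..256");
        }
        SortedSet<String> splits = new TreeSet<>();
        for (int i = 1; i < numTablets; i++) {
            int boundary = i * 256 / numTablets;
            // Two-digit lowercase hex sorts lexicographically in numeric order.
            splits.add(String.format("%02x", boundary));
        }
        return splits;
    }

    public static void main(String[] args) {
        // 15 matches the node count mentioned in the question; a real table
        // may want several tablets per tablet server.
        System.out.println(evenSplits(15));
    }
}
```

If the keys are not uniformly distributed, split points taken from a sample of the real data are likely to work better than evenly spaced ones.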

Does the data distribution match your splits?
Is the ingest data already sorted (that is, it always writes to the last tablet)?
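
One way to check both questions offline is to count how many sample rows fall into each tablet for a given set of split points. The sketch below is only an illustration: the class name, the date-prefixed row keys, and the split points are hypothetical, and it compares rows as Java strings, which matches Accumulo's byte ordering only for ASCII keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class SplitHistogram {
    // Counts how many sample rows land in each tablet defined by the splits.
    // Tablet i holds rows r with split[i-1] < r <= split[i]; a split point is
    // the inclusive end row of its tablet. The last tablet has no end row.
    public static int[] countPerTablet(SortedSet<String> splits, List<String> rows) {
        List<String> ends = new ArrayList<>(splits);
        int[] counts = new int[ends.size() + 1];
        for (String row : rows) {
            int idx = Collections.binarySearch(ends, row);
            // Exact match: the row belongs to the tablet ending at that split.
            // Otherwise binarySearch returns -(insertionPoint) - 1.
            int tablet = idx >= 0 ? idx : -idx - 1;
            counts[tablet]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical splits chosen from older, date-prefixed data.
        SortedSet<String> splits =
            new TreeSet<>(Arrays.asList("20130101", "20130201", "20130301"));
        // Newly arriving rows all sort after the final split point.
        List<String> rows =
            Arrays.asList("20130404-0001", "20130404-0002", "20130404-0003");
        System.out.println(Arrays.toString(countPerTablet(splits, rows)));
    }
}
```

With the sample values in main, every row lands in the last tablet, which is the pattern to look for when sorted input turns a single tablet server into a hot spot.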

How much memory and how many threads are you using in your batchwriters?

Check the ingest rates on the tablet server monitor page and look for hot spots.

On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin <[EMAIL PROTECTED]> wrote:

> Hello,
> I am fairly new to Accumulo and am trying to figure out what is preventing
> my system from ingesting data at a faster rate. We have 15 nodes running a
> simple Java program that reads and writes to Accumulo and then indexes some
> data into Solr. The rate of ingest is not scaling linearly with the number
> of nodes that we start up. I have tried increasing several parameters
> including:
> - limit of file descriptors in Linux
> - max zookeeper connections
> - tserver.memory.maps.max
> - tserver_opts memory size
> - tserver.mutation_queue.max
> - tserver.scan.files.open.max
> - tserver.walog.max.size
> - tserver.cache.data.size
> - tserver.cache.index.size
> - hdfs setting for xceivers
> No matter what changes we make, we cannot get the ingest rate to go over
> 100k entries/s and about 6 Mb/s. I know Accumulo should be able to ingest
> faster than this.
> Thanks in advance,
> Jimmy Lin