Running Continuous Ingest on small cluster
In playing around with the continuous ingest collection of code (ingest,
walkers, batchwalkers, scanners and agitators), I found myself blindly
guessing at how many of each of these processes I should use.

Are there some generic thoughts as to what might be an ideal saturation
point for N tservers?

I initially split my hosts 4 ways and ran (N/4) of each process (ingest,
walkers, batchwalkers, and scanners), ratcheting down the number of
threads for ingest and batchwalkers (to avoid saturating CPU and memory).
Should I try to balance (query threads * query clients) + (ingest
threads * ingest clients) against the available threads per host, and
size the BatchWriter send buffers the same way against the memory
available per host?
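
To make that balance concrete, here is a minimal sketch of the arithmetic
in Python. Everything in it is an assumption for illustration: the
ClientGroup, host_load and fits names are hypothetical, and the thread
counts, buffer sizes and host limits are made-up numbers, not measured or
recommended values.

```python
from dataclasses import dataclass


@dataclass
class ClientGroup:
    """One kind of continuous-ingest client running on a host (hypothetical model)."""
    name: str
    clients_per_host: int
    threads_per_client: int
    # Write buffer held by each client, in bytes; 0 for query-only clients.
    buffer_bytes_per_client: int = 0


def host_load(groups):
    """Return (total threads, total buffer bytes) the groups place on one host."""
    threads = sum(g.clients_per_host * g.threads_per_client for g in groups)
    memory = sum(g.clients_per_host * g.buffer_bytes_per_client for g in groups)
    return threads, memory


def fits(groups, available_threads, available_bytes):
    """True if the combined load stays within the host's thread and memory budget."""
    threads, memory = host_load(groups)
    return threads <= available_threads and memory <= available_bytes


# Illustrative numbers only: one ingest client and one batchwalker per host.
groups = [
    ClientGroup("ingest", clients_per_host=1, threads_per_client=4,
                buffer_bytes_per_client=100 * 1024 * 1024),
    ClientGroup("batchwalker", clients_per_host=1, threads_per_client=8),
]

threads, memory = host_load(groups)
print(threads, memory, fits(groups, available_threads=16,
                            available_bytes=512 * 1024 * 1024))
```

This only checks the sums against a budget; it says nothing about whether
that budget is the right saturation point for a tserver, which is the open
question here.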

I appreciate anyone's insight.

- Josh