The verification job is very sensitive to the number of rounds it
takes to shuffle/sort the results. How many reducers have you used,
and how much memory have you given them? More is better.
I think we've clocked the verification job for 24 hours of ingest in
under 2 hours. This is from memory, so I could be wrong. But with a
bad configuration (only a few small reducers), it can take a very long time.
Go with as many as 100 reducers per node and let the reducers have a
lot of memory. You want each reducer to run long enough to make the
process creation overhead small. So they should run for a few
Please post back with any improvements!
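
To make the sizing advice above concrete, here is a rough sketch of the arithmetic in Python. It is not taken from the Accumulo test scripts: the function names, the memory figures, and the startup time are all assumptions chosen for illustration, so plug in your own cluster's numbers.

```python
def plan_reducers(nodes, reducers_per_node, node_memory_mb, reserved_mb=0):
    """Return (total reducers, memory per reducer in MB).

    Hypothetical helper: reserved_mb stands for memory kept back for the
    OS, tablet servers, and other daemons on each node.
    """
    if nodes <= 0 or reducers_per_node <= 0:
        raise ValueError("nodes and reducers_per_node must be positive")
    if reserved_mb >= node_memory_mb:
        raise ValueError("reserved_mb must be smaller than node_memory_mb")
    total = nodes * reducers_per_node
    per_reducer_mb = (node_memory_mb - reserved_mb) // reducers_per_node
    return total, per_reducer_mb


def startup_overhead_fraction(run_seconds, startup_seconds):
    """Fraction of a reducer's wall-clock time spent on process creation."""
    return startup_seconds / (startup_seconds + run_seconds)


# Example with made-up numbers: 7 nodes, 4 reducers per node,
# 16 GB per node with 4 GB held back for other services.
total, per_reducer_mb = plan_reducers(7, 4, 16384, reserved_mb=4096)
overhead = startup_overhead_fraction(run_seconds=300, startup_seconds=3)
print(total, per_reducer_mb, round(overhead, 4))
```

The point of the second function is just that a longer-running reducer amortizes its startup cost; the 3-second startup figure is a guess, not a measurement.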
We are about to enter a testing cycle, so I'll update the example
configuration files with some better instructions.
I'm curious, how many key/value entries did you ingest in 24 hours?
On Tue, Oct 22, 2013 at 4:56 PM, Billie Rinaldi
<[EMAIL PROTECTED]> wrote:
> I believe it does take a long time to verify. Shorter than, but a similar
> order of magnitude as, the amount of time it took to write the data.
> Others may be able to give you more quantitative information.
> On Tue, Oct 22, 2013 at 12:56 PM, Ryan Fishel <[EMAIL PROTECTED]> wrote:
>> I am currently running through the test suites included with the Accumulo
>> package ($ACCUMULO_HOME/test/system) and am running into some rather long
>> verification times with the Continuous Test.
>> I am running the continuous test for a 24 hour period on a 7 node cluster
>> with walkers, batch walkers, and the stats service turned on. All jobs
>> appear to run fine during the whole period. Since the test docs don't give
>> any indication, I was wondering if someone could provide typical run times
>> for the verification job? I'd like to appropriately set my expectations
>> before I start looking for a misconfiguration in the underlying cluster.
>> Thank you!
>> Ryan Fishel