Aaron Cordova 2012-03-14, 17:00
What are the current limitations on storing large values in Accumulo? Specifically, what would break first, and how big would a value have to be to cause the break? Also, is storing lots of large values much worse than storing occasional large values?
I can of course test this but it'd be nice to hear what is already known about the subject first.
Aaron
Eric Newton 2012-03-14, 17:22
It depends on how large the loggers' JVMs can get. We typically run them with less memory than the tablet servers, so we see the loggers break when the large mutation hits the logger.
If your logger has a large JVM, then your tablet server will crash receiving the message, or serializing it to the loggers, and then again when it tries to recover after crashing, which will slowly take down your whole cluster.
However, if a large value is bulk loaded, you will see the tablet servers crash when they do a major compaction.
The upper limit is something like 1/3 of the JVM memory size, but in practice, if you have a lot of large values, this limit will be much lower.
-Eric
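As a rough illustration of the "1/3 of the JVM memory size" figure above, here is a minimal sketch that estimates which process would hit its ceiling first. The function names and the heap sizes are hypothetical, the one-third fraction is only the rule of thumb quoted in this thread, and nothing here calls Accumulo itself.

```python
from typing import Tuple

GIB = 1024 ** 3

def rough_value_ceiling(heap_bytes: int) -> int:
    """Rule-of-thumb ceiling for one value: about one third of the JVM heap."""
    return heap_bytes // 3

def likely_first_limit(logger_heap_bytes: int, tserver_heap_bytes: int) -> Tuple[str, int]:
    """Return the process with the lower estimated ceiling, and that ceiling in bytes."""
    logger_cap = rough_value_ceiling(logger_heap_bytes)
    tserver_cap = rough_value_ceiling(tserver_heap_bytes)
    if logger_cap <= tserver_cap:
        return ("logger", logger_cap)
    return ("tablet server", tserver_cap)

# Hypothetical heap sizes: a 1 GiB logger and a 3 GiB tablet server.
component, cap = likely_first_limit(1 * GIB, 3 * GIB)
print(component, cap)
```

With many large values the practical ceiling would be well below this estimate, so treat the result as an upper bound rather than a safe size.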
Keith Turner 2012-03-14, 22:01
Concurrency is another consideration. There could be multiple large values in flight at any given time. For example, multiple clients could be writing mutations with large values while minor compactions are writing out large values and major compactions are reading and writing large values. In this case, all of these threads running concurrently could exhaust memory even though no single thread would. The probability of this happening at the same time increases as more large values are written.
Keith
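The concurrency point can be sketched with simple arithmetic: each value fits in memory on its own, but the sum of everything in flight does not. The memory budget, the value sizes, and the function name below are hypothetical and chosen only for illustration.

```python
from typing import Iterable

MB = 1024 ** 2

def exceeds_budget(in_flight_sizes: Iterable[int], budget_bytes: int) -> bool:
    """True if the values held in memory at the same time add up to more than the budget."""
    return sum(in_flight_sizes) > budget_bytes

# Hypothetical scenario: a client write, a minor compaction, and a major
# compaction each hold one 120 MB value while only 300 MB is available.
budget = 300 * MB
in_flight = [120 * MB, 120 * MB, 120 * MB]

fits_alone = all(not exceeds_budget([size], budget) for size in in_flight)
fits_together = not exceeds_budget(in_flight, budget)
print(fits_alone, fits_together)
```

Under these assumed numbers, no single thread exhausts memory, but the three running together do.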