Zookeeper >> mail # dev >> RFC: Behavior of QuotaExceededException

Re: RFC: Behavior of QuotaExceededException
This is fundamentally one of the huge problems with running "ZooKeeper as a
service". I would be very interested to see how you get around it. ZK is so
sensitive to client (mis)behavior, as I'm sure you're experiencing, that
it's really difficult to provide it as a service.
Do you guys control the ZK clients? One thing that I did to get around
problems like this was actually wrapping the client library into my own,
more restrictive client. There's not a great way to ensure that the only
people connecting to your ZK are actually running your client (we need an
API key or something) but if you're actually wrapping the way people work
with ZK you can solve a reasonable subset of problems. In fact, when I was
running a "ZooKeeper as a Service" centralized system, the only clients
that ever caused me problems were from a group of Perl developers who
weren't using a client my team had provided and thus totally misused the
system. This would at least make your last two problems (requests/sec and
update bytes/sec) less likely to happen.
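
To make the wrapping idea concrete, here is a minimal sketch of a more restrictive client that enforces requests/sec and update bytes/sec on the client side. Everything in it is hypothetical and for illustration only: the names RestrictedClient, TokenBucket and ClientQuotaExceeded are made up, the token bucket is just one common way to rate-limit, and the backend is assumed to be any object with a set(path, data) method. It is not ZooKeeper's API and not what ZOOKEEPER-1383 implements.

```python
import time


class TokenBucket:
    """Allows `rate` units per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def try_consume(self, amount=1):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if amount > self.tokens:
            return False
        self.tokens -= amount
        return True


class ClientQuotaExceeded(Exception):
    """Raised locally, before anything is sent to the ensemble."""


class RestrictedClient:
    """Hypothetical wrapper that exposes only set() and rate-limits it."""

    def __init__(self, backend, max_requests_per_sec, max_bytes_per_sec,
                 clock=time.monotonic):
        # `backend` is assumed to be any object with set(path, data).
        self._backend = backend
        self._requests = TokenBucket(max_requests_per_sec,
                                     max_requests_per_sec, clock)
        self._bytes = TokenBucket(max_bytes_per_sec, max_bytes_per_sec, clock)

    def set(self, path, data):
        # Note: a request token is spent even if the byte check then fails.
        if not self._requests.try_consume(1):
            raise ClientQuotaExceeded("requests/sec limit reached")
        if not self._bytes.try_consume(len(data)):
            raise ClientQuotaExceeded("update bytes/sec limit reached")
        return self._backend.set(path, data)
```

A wrapper like this only helps against well-behaved users of the wrapper; as noted above, nothing stops someone from connecting with a different client.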

For the issue of node count and used bytes, those are currently set on
subtrees, aren't they? So it doesn't just affect the client connected but
any client using that subtree? I don't know how anyone can sanely reason
about a size/byte limit on a node/subtree that someone else is crapping all
over, unless it's tied to ACLs or something. How are you getting around
On Wed, Feb 27, 2013 at 2:12 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:

> On 2/27/13 12:10 AM, "Flavio Junqueira" <[EMAIL PROTECTED]> wrote:
> >It wouldn't be very nice to allow holes in the sequence of operations of
> >a client, it would violate session semantics. I'm also wondering about a
> >couple of things:
> >
> >- What does QuotaExceededException convey to the application? That the
> >application client won't ever be able to send operations again with that
> >session? That it won't be able to submit new operations for up to x
> >amount of time, where x is computed somehow? Expiring the session will
> >have the side-effect that all the ephemeral nodes will be gone, I'm not
> >sure that's desirable, but as a punishment it might work out fine. ;-)
> My initial plan is to support four types of hard limits (node count, used
> bytes, requests/sec and update bytes/sec). For the first two types of
> limits, it is likely that the client won't be able to complete any operation
> after the quota is exceeded. For the last two, after some amount of time, the
> client should be able to make a successful request.
> >- Have you considered limiting the rate of client operations instead of
> >failing operations? Shaping the traffic of operations of a client might
> >be way nicer from the client perspective, but perhaps a bit harder to
> >implement.
> We considered that as well, and I already prototyped this feature a while
> back. The main problem that I saw is that the network layer (e.g., the NIO
> subsystem) only knows about request size/rate, the client's IP/port and
> sessionId, so its ability to do throttling is limited. Additionally, a
> client with a low session timeout will eventually time out and
> reconnect to another server (or create a new session), which will allow it
> to make successful requests on the other server until it exceeds the usage
> threshold again.
> Thanks for your response. I think I will go with the session expiration route.
> >
> >-Flavio
> >
> >On Feb 27, 2013, at 1:41 AM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >> I am currently working on ZOOKEEPER-1383. One of the main features
> >>introduced in this change is to allow ZooKeeper to enforce hard limits
> >>(e.g., txn per sec) per folder.
> >>
> >> With hard limits, we need to introduce a new exception/error code
> >>(QuotaExceeded) for ZooKeeper operations that modify the DataTree.  If a
> >>client gets this error, it means that the particular operation has
> >>definitely failed.
> >>
> >> From our internal discussion, this may make it harder for a user to
> >>write an application.  The thought is that this can possibly introduce a