0.8 wire protocol for inter-broker communication
Jun Rao 2013-01-29, 16:33
In 0.8, we added a versionId to each type of request. The plan is that if
we want to evolve a particular request, we can implement the logic in the
broker to support both the old and the new versions. Then, we can upgrade
the server first, followed by the clients.
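To make the "support both versions in the broker" idea concrete, here is a minimal sketch of a decoder that branches on the version id. The wire layout, the field names (partition, timestamp) and the function decode_request are hypothetical and only for illustration; this is not the actual 0.8 request format.

```python
import struct

# Hypothetical wire layout, for illustration only (not Kafka's real format):
#   v0: int16 version_id, int32 partition
#   v1: int16 version_id, int32 partition, int64 timestamp
def decode_request(buf: bytes) -> dict:
    """Decode a request, accepting both the old (v0) and new (v1) layouts."""
    (version_id,) = struct.unpack_from(">h", buf, 0)
    if version_id == 0:
        (partition,) = struct.unpack_from(">i", buf, 2)
        # Old clients do not send a timestamp; the broker fills in a default.
        return {"version_id": 0, "partition": partition, "timestamp": None}
    if version_id == 1:
        partition, timestamp = struct.unpack_from(">iq", buf, 2)
        return {"version_id": 1, "partition": partition, "timestamp": timestamp}
    raise ValueError(f"unsupported version_id {version_id}")


old_client_bytes = struct.pack(">hi", 0, 7)
new_client_bytes = struct.pack(">hiq", 1, 7, 123)
old_decoded = decode_request(old_client_bytes)
new_decoded = decode_request(new_client_bytes)
```

Because the upgraded broker still understands v0, clients can move to v1 at their own pace after the servers are done.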
However, this approach doesn't quite work for requests used among brokers.
These include all requests sent by the controller (e.g.,
LeaderAndIsrRequest) and FetchRequest (used by replica fetchers). If we
want to evolve those requests, we will have to bring down the whole cluster
to do the upgrade (since each broker is both a client and a server). This
of course will make the cluster unavailable.
So, we need to think about a couple of things. First, what's our strategy
to evolve those inter-broker requests? One thing that I can think of is to
do the upgrade in two passes. In the first pass, we upgrade all brokers
first so that each of them is capable of receiving the new version, but not
able to send the new version (this can be controlled by a config). In the
second pass, we upgrade all brokers again, this time allowing them to send
the new version. I'm not sure this is the best way, since it makes the
upgrade a bit more complicated.
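The two-pass idea can be checked with a small simulation. This is only a sketch under simplifying assumptions: the Broker class, its supported_versions and send_version fields, and the helper functions are all made up for illustration, and "compatible" here just means every broker can decode what every other broker sends.

```python
from dataclasses import dataclass


@dataclass
class Broker:
    supported_versions: frozenset  # versions this broker's code can decode
    send_version: int              # version it sends, set by a hypothetical config


def old_broker():
    return Broker(frozenset({0}), send_version=0)


def pass1_broker():
    # New code that can receive v1, but config still pins sending to v0.
    return Broker(frozenset({0, 1}), send_version=0)


def pass2_broker():
    # Same code, config flipped so that it now sends v1.
    return Broker(frozenset({0, 1}), send_version=1)


def cluster_is_compatible(brokers):
    """True if every receiver can decode what every sender emits."""
    return all(s.send_version in r.supported_versions
               for s in brokers for r in brokers)


def rolling_upgrade(brokers, make_broker):
    """Restart brokers one at a time; record compatibility after each restart."""
    results = []
    for i in range(len(brokers)):
        brokers[i] = make_broker()
        results.append(cluster_is_compatible(brokers))
    return results


cluster = [old_broker() for _ in range(3)]
two_pass = rolling_upgrade(cluster, pass1_broker) + rolling_upgrade(cluster, pass2_broker)

# For comparison: jumping straight to "send v1" in a single rolling restart.
single_pass = rolling_upgrade([old_broker() for _ in range(3)], pass2_broker)
```

In this model the two-pass upgrade stays compatible after every restart, while the single-pass one is broken until the last broker is replaced, which is the reason for the extra pass.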
Second, we probably need to make another pass over those requests to make
sure that they are in good shape, since any change in the future may not be
easy. For example, in the LeaderAndIsr response, should we remove the global
errorcode since we already have an errorcode per partition? Also, for the
FetchRequest used by the replica fetcher, currently we assume that the fetch
offset equals the logEndOffset of the remote replica. If we want to
pipeline those requests, this may not be true, since a request could be sent
before the data from earlier responses has been appended. So, we will need a
separate field to represent logEndOffset.
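A rough sketch of why pipelining needs the extra field. The class ReplicaFetchRequest, its two fields, and the fixed batch_size are assumptions for illustration, not the actual request definition.

```python
from dataclasses import dataclass


@dataclass
class ReplicaFetchRequest:
    fetch_offset: int    # where the leader should start reading
    log_end_offset: int  # what the follower has actually appended so far


def pipelined_requests(log_end_offset, batch_size, depth):
    """Build `depth` in-flight requests for consecutive offset ranges.

    Assumes fixed-size batches, purely to keep the example small.
    """
    return [ReplicaFetchRequest(fetch_offset=log_end_offset + i * batch_size,
                                log_end_offset=log_end_offset)
            for i in range(depth)]


requests = pipelined_requests(log_end_offset=100, batch_size=50, depth=3)
fetch_offsets = [r.fetch_offset for r in requests]
reported_leos = [r.log_end_offset for r in requests]
```

With three requests in flight, only the first has fetch_offset equal to the follower's logEndOffset. If the leader inferred the follower's position from fetch_offset alone, it would overestimate how far the follower has caught up.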