Re: Avro enhancement: asynchronous RPCs for Java clients
I just finished a second attempt at the asynchronous RPC implementation
incorporating Philip's feedback and some other ideas that I had.  I think
it's easiest to explain how it works with an example.  So here's a simple
IDL and schema:

IDL:
protocol Calculator {
  int add(int arg1, int arg2);
}

Schema:
{"protocol":"Calculator","messages":{
  "add":{
    "request":[{"name":"arg1","type":"int"},{"name":"arg2","type":"int"}],
    "response":"int"}}}

No changes are required to the IDL or schema to enable async RPCs.  The Avro
Java compiler will generate two interfaces instead of one.  The first
interface, Calculator, contains the standard synchronous methods.  The
second interface, CalculatorClient, extends Calculator and adds asynchronous
methods for all two-way messages.  The async methods live in a separate
interface because the responder/server side doesn't need to know (and
shouldn't know) about the client-side async methods.  So the
Responder/server implements Calculator, and the Requestor/client can use
either Calculator or CalculatorClient to invoke the RPCs.  For reference,
here is what the two generated interfaces look like (without the PROTOCOL
field and package names):

public interface Calculator {
  int add(int arg1, int arg2) throws AvroRemoteException;
}
public interface CalculatorClient extends Calculator {
  CallFuture<Integer> addAsync(int arg1, int arg2) throws IOException;
  CallFuture<Integer> addAsync(int arg1, int arg2, Callback<Integer> callback)
      throws IOException;
}

The CalculatorClient interface is the only new component.  It has two
methods for each message, one that takes a Callback and one that does not.
Both methods return a CallFuture so that the client has the option of using
either the Future or the Callback to obtain the result of the RPC.
Future.get() blocks until the RPC is complete, and either returns the result
or throws an exception if one occurred during the RPC.  The Callback
interface has two methods, handleResult(T result) and handleError(Exception
error).  One or the other is always called depending on whether the RPC was
successful or an Exception was thrown.
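
To make the two calling styles concrete, here is a minimal, self-contained sketch. It does not use the real Avro classes: the nested Callback interface is a hypothetical stand-in with the two methods described above, java.util.concurrent.CompletableFuture stands in for CallFuture, and addAsync computes the sum on a background thread instead of sending an RPC. The class and helper names are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncCalculatorSketch {

  // Hypothetical stand-in for the Callback interface described in this message.
  public interface Callback<T> {
    void handleResult(T result);
    void handleError(Exception error);
  }

  // Stand-in for the generated addAsync method. A real client would send the
  // request over a transceiver; here the "RPC" is a local addition that runs
  // on a background thread. CompletableFuture plays the role of CallFuture.
  public static CompletableFuture<Integer> addAsync(
      int arg1, int arg2, Callback<Integer> callback) {
    CompletableFuture<Integer> future =
        CompletableFuture.supplyAsync(() -> arg1 + arg2);
    if (callback != null) {
      // Exactly one of the two callback methods runs once the call finishes.
      future.whenComplete((result, error) -> {
        if (error == null) {
          callback.handleResult(result);
        } else {
          callback.handleError(new Exception(error));
        }
      });
    }
    return future;
  }

  // Style 1: block on the returned future until the result is available.
  public static int addViaFuture(int a, int b)
      throws InterruptedException, ExecutionException {
    return addAsync(a, b, null).get();
  }

  // Style 2: receive the result in a callback; the latch is only here so the
  // caller can wait for the callback to fire before reading the result.
  public static int addViaCallback(int a, int b) throws InterruptedException {
    CountDownLatch done = new CountDownLatch(1);
    AtomicInteger sum = new AtomicInteger();
    addAsync(a, b, new Callback<Integer>() {
      @Override
      public void handleResult(Integer result) {
        sum.set(result);
        done.countDown();
      }

      @Override
      public void handleError(Exception error) {
        done.countDown();
      }
    });
    done.await();
    return sum.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(addViaFuture(2, 3));   // prints 5
    System.out.println(addViaCallback(4, 5)); // prints 9
  }
}
```

In real client code the callback style would normally not block at all; the latch exists only so this sketch can return a value.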

In addition to the compiler changes, I had to make some changes in the
avro-ipc project to get the async plumbing to work correctly.  Most of these
changes are in Requestor and NettyTransceiver.  As part of the Requestor
changes, I ended up replacing a couple of large synchronized blocks with
finer-grained critical sections protected by reentrant locks.  I
think this change improved performance overall, at least in the case where
multiple threads are using the same client.  I implemented a rudimentary
performance test that spins up a bunch of threads, executes the same RPC
(Simple.hello(String)) repeatedly for a fixed amount of time, and then
calculates the average number of RPCs completed per second.  With Avro 1.5.1
I got 7,450 RPCs/sec, and with my modified version of trunk I got 19,050
RPCs/sec, roughly 2.6 times the throughput.  That was a very simple test, but
if there is a standard benchmark
that the Avro team uses I'd be happy to rerun my tests using that.
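
For anyone who wants to run a comparable measurement, the kind of test described above might look roughly like the following sketch. It is not the actual test code: the class name, the measure method, and its parameters are assumptions, and the Runnable passed in is a placeholder for a real call such as Simple.hello(String) on a shared client.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class ThroughputSketch {

  // Runs the given call repeatedly on `threads` threads for about
  // `durationMillis` and returns the average number of completed calls
  // per second across all threads.
  public static double measure(Runnable rpc, int threads, long durationMillis)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    LongAdder completed = new LongAdder();
    CountDownLatch start = new CountDownLatch(1);
    long durationNanos = TimeUnit.MILLISECONDS.toNanos(durationMillis);

    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all workers at the same moment
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        long deadline = System.nanoTime() + durationNanos;
        while (System.nanoTime() - deadline < 0) {
          rpc.run();
          completed.increment();
        }
      });
    }

    long begin = System.nanoTime();
    start.countDown();
    pool.shutdown();
    // Generous upper bound so a slow final call cannot hang the test forever.
    pool.awaitTermination(durationMillis + 5_000, TimeUnit.MILLISECONDS);
    double elapsedSeconds = (System.nanoTime() - begin) / 1e9;
    return completed.sum() / elapsedSeconds;
  }

  public static void main(String[] args) throws InterruptedException {
    // Placeholder workload; substitute a real RPC on a shared client here.
    double rate = measure(() -> "hello".concat(" world"), 4, 200);
    System.out.printf("%.0f calls/sec%n", rate);
  }
}
```

The numbers from a harness like this depend heavily on thread count, run length, and JVM warm-up, so they are only useful for comparing two builds under identical settings.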

So that's basically it.  All existing unit tests pass, and I wrote
additional tests for all the new async functionality.  I've documented all
public interfaces, and I think the changes are ready to be reviewed if any
of the committers have time to take a look.  Should I post a patch
somewhere?  AVRO-539?  ReviewBoard?

-James
On Tue, May 31, 2011 at 9:09 PM, James Baldassari <[EMAIL PROTECTED]> wrote:

> Thanks for the helpful feedback!
>
> After thinking about this more, I agree that it would be cleaner and
> simpler to remove the "async" keyword/property from the IDL and schema.
> Instead I'll just generate the asynchronous companion methods for all
> two-way messages.
>
> Regarding the passing of RPC results and exceptions/errors back to the
> client asynchronously, I'm not sure what the best approach is.  I had
> considered the use of both the future pattern and the callback pattern, but