Avro >> mail # dev >> Thoughts on an RPC protocol


Re: Thoughts on an RPC protocol
Hi Bo,

Thanks for your feedback!

On Thu, Apr 8, 2010 at 3:49 PM, Bo Shi <[EMAIL PROTECTED]> wrote:

> Hi Bruce,
>
> Would this RPC protocol take the role of the transport in the Avro
> specification or would it replace the protocol?  If the handshake
> occurs on channel 0 while the request/response payloads are
> transferred on a different channel, this would not meet the existing
> wire protocol as described in the current 1.3.2 spec right?
>

This would be a transport.

I was thinking the handshakes would happen on channel 0, with 0 acting as a
control channel, but I'm not married to that. Offhand, I don't see why that
would violate anything in the specification, though.

> A couple other questions inline:
>
> On Thu, Apr 8, 2010 at 11:54 AM, Bruce Mitchener
> <[EMAIL PROTECTED]> wrote:
> > While I recommend actually reading RFC 3080 (it is an easy read), this
> > summary may help...
> >
> > Framing: Length prefixed data, nothing unusual.
> > Encoding: Messages are effectively this:
> >
> > enum message_type {
> >     message,         // a request
> >     reply,           // when there's only a single reply
> >     answer,          // when there are multiple replies, send
> >                      // multiple answers and then a null
> >     null,            // terminate a chain of replies
> >     error,           // oops, there was an error
> > }
> >
> > struct message {
> >     enum message_type message_type;
> >     int channel;
> >     int message_id;
> >     bool more;                  // is this message complete, or is
> >                                 // more data coming? for streaming
> >     int sequence_number;        // see RFC 3080
> >     optional int answer_number; // used for answers
> >     bytes payload;              // the actual RPC command, still serialized here
> > }
> >
> > When a connection is opened, there's initially one channel, channel 0.
> > That channel is used for commands controlling the connection state, like
> > opening and closing channels.  We should also perform Avro RPC handshakes
> > over channel 0.
>
> Is channel 0 used exclusively as a control channel or would requests
> be allowed on this channel?  Any idea on what the control messages
> would look like?

Channel 0 would be a control channel in my original proposal. I can see
arguments, from a simplicity point of view, for also allowing ordinary
requests on it. Thoughts?
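To make the quoted message layout concrete, here is a rough Python sketch of one way such frames might be packed. The fixed-width struct encoding, the presence byte for the optional field, and the field order are illustrative assumptions on my part; an actual implementation would presumably use Avro serialization (variable-length integers, etc.) for the header fields.

```python
# Hypothetical framing sketch -- NOT the proposed wire encoding, just an
# illustration of the length-prefixed layout from the quoted struct.
import struct
from enum import IntEnum

class MessageType(IntEnum):
    MESSAGE = 0  # a request
    REPLY = 1    # when there's only a single reply
    ANSWER = 2   # one of multiple replies
    NULL = 3     # terminates a chain of answers
    ERROR = 4    # oops, there was an error

def encode_frame(msg_type, channel, message_id, more, seq, payload,
                 answer_number=None):
    """Length-prefix one frame; answer_number is present for answers only."""
    # header: type, channel, message_id, more-flag, sequence_number
    body = struct.pack(">BIIBI", msg_type, channel, message_id, int(more), seq)
    # model the 'optional' field with a one-byte presence flag
    has_answer = answer_number is not None
    body += struct.pack(">B", int(has_answer))
    if has_answer:
        body += struct.pack(">I", answer_number)
    body += payload  # the serialized RPC command
    return struct.pack(">I", len(body)) + body  # length prefix, then body
```

The streaming case in the quoted struct falls out naturally: a sender sets `more=True` on every frame of a stream except the last, reusing the same `channel` and `message_id`.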

Control messages would look like an Avro RPC call. They would be things
like:

   - OpenChannel, which returns the newly opened channel number (or you
     could pass it the desired channel number?)
   - CloseChannel, which takes the channel number to close.
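As a sketch of those semantics (plain Python rather than Avro; the class and method names here are hypothetical, not part of any spec):

```python
# Illustrative model of channel 0 as a control channel carrying
# OpenChannel / CloseChannel commands. Names and behavior are assumptions.
class ControlChannel:
    def __init__(self):
        self._next = 1     # channel 0 is reserved for control itself
        self._open = {0}

    def open_channel(self):
        """OpenChannel: assign and return a new channel number."""
        ch = self._next
        self._next += 1
        self._open.add(ch)
        return ch

    def close_channel(self, ch):
        """CloseChannel: takes the channel number to close."""
        if ch == 0:
            raise ValueError("channel 0 (control) cannot be closed")
        self._open.discard(ch)
```

Having the server assign the number (rather than the caller passing one in) sidesteps collisions when both peers open channels concurrently, which is one way to answer the parenthetical question above.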

> > Channels allow for concurrency.  You can send requests/messages down
> > multiple channels and process them independently. Messages on a single
> > channel need to be processed in order though. This allows for both
> > guaranteed order of execution (within a single channel) and greater
> > concurrency (multiple channels).
> >
> > Streaming happens in 2 ways.
>
> For streaming transfers, thoughts on optional compression codec
> attachment to streaming channels?  It may be useful for IO-bound
> applications, but if you're transferring files like avro object
> container files that are already compressed - you'd need some extra
> coordination (but maybe that's outside the problem domain).

Compression would probably be something useful to support, agreed. That
could happen in various ways:
   - Per channel (all data on the channel is compressed)
   - Per request via a header of some sort
   - Per connection (all data on all channels)
I suspect that I prefer per-channel from a simplicity point of view, but I'd
like to hear what people think.
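A minimal sketch of the per-channel option, assuming a hypothetical Channel class and using zlib purely as a placeholder codec (which codec a channel uses could presumably be negotiated when the channel is opened):

```python
# Per-channel compression sketch: a channel either compresses all of its
# payloads or none of them. zlib is an illustrative stand-in codec.
import zlib

class Channel:
    def __init__(self, number, compressed=False):
        self.number = number
        self.compressed = compressed  # fixed for the channel's lifetime

    def wrap(self, payload: bytes) -> bytes:
        """Apply the channel's codec to an outgoing payload."""
        return zlib.compress(payload) if self.compressed else payload

    def unwrap(self, data: bytes) -> bytes:
        """Reverse the channel's codec on an incoming payload."""
        return zlib.decompress(data) if self.compressed else data
```

This also addresses the already-compressed-file case Bo raises: a client transferring Avro object container files would simply open that channel without compression.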

 - Bruce