Hadoop, mail # general - HTTP transport?


Re: HTTP transport?
Patrick Hunt 2009-11-12, 22:22
One additional benefit of using HTTP is that people are always working
to improve its performance, and not only by optimizing servers -- see
Google's SPDY:

http://www.readwriteweb.com/archives/spdy_google_wants_to_speed_up_the_web.php

It offers multiplexed requests, compressed headers, etc.

Patrick

Doug Cutting wrote:
> I'm considering an HTTP-based transport for Avro as the preferred,
> high-performance option.
>
> HTTP has lots of advantages.  In particular, it already has
>  - lots of authentication, authorization and encryption support;
>  - highly optimized servers;
>  - monitoring, logging, etc.
>
> Tomcat and other servlet containers support async NIO, where a thread is
> not required per connection.  A servlet can process bulk data with a
> single copy to and from the socket (bypassing stream buffers).  Calls
> can be multiplexed over a single HTTP connection using Comet events.
>
> http://tomcat.apache.org/tomcat-6.0-doc/aio.html
>
> Zero copy is not an option for servlets that generate arbitrary data,
> but one can specify a file/start/length tuple and Tomcat will use
> sendfile to write the response.  That means that while HDFS datanode
> file reads could not be done via RPC, they could be done via HTTP with
> zero-copy.  If authentication and authorization are already done in the
> HTTP server, this may not be a big loss.  The HDFS client might make two
> HTTP requests, one to read a file's data, and another to read its
> checksums.  The server would then stream the entire block to the client
> using sendfile, using TCP flow control as today.
>
> Thoughts?
>
> Doug
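
The file/start/length idea above can be illustrated outside of Tomcat with plain Java NIO. This is only a sketch under assumptions: the class name ZeroCopySketch and the method sendRange are hypothetical, and it uses FileChannel.transferTo, which the JDK may implement with sendfile when the target is a socket channel on a platform that supports it. It is not the Tomcat sendfile mechanism discussed in the thread, nor anything from HDFS or Avro.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: sends a (file, start, length) range to a channel.
public class ZeroCopySketch {

    /**
     * Transfers up to `length` bytes of `file`, beginning at offset `start`,
     * into `out`. Returns the number of bytes actually transferred.
     */
    static long sendRange(Path file, long start, long length,
                          WritableByteChannel out) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            // Clamp the range to the end of the file.
            long end = Math.min(start + length, in.size());
            long pos = start;
            while (pos < end) {
                // transferTo may move fewer bytes than requested, so loop.
                long n = in.transferTo(pos, end - pos, out);
                if (n <= 0) {
                    break; // target accepted nothing; stop rather than spin
                }
                pos += n;
            }
            return Math.max(0, pos - start);
        }
    }
}
```

With a SocketChannel as the target, the bytes need not pass through a user-space buffer; with other channel types the JDK falls back to ordinary copying, so the zero-copy benefit depends on the platform and the kind of channel.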