Kafka >> mail # dev >> Kafka REST interface


David Arthur 2012-09-10, 13:49
Re: Kafka REST interface
Another bump for this thread...

For those just joining, this prototype is a simple HTTP server that proxies the complex consumer code through two HTTP endpoints.

https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala

E.g.,
  
    curl http://localhost:8888/my-topic -X POST -d 'Here is a message'

and

    curl http://localhost:8888/my-topic/my-group -X GET

This is not an attempt to expose the FetchRequest/ProduceRequest protocol over HTTP.
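
To make the two endpoints concrete, here is a minimal sketch in Python (not the Scala/Jetty prototype linked above). It assumes an in-memory log as a stand-in for Kafka. The names `produce`, `consume`, `ProxyHandler` and `serve` are hypothetical, and the 204/404 status codes are choices made for this sketch, not something the prototype is known to do.

```python
import threading
from collections import defaultdict
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: an in-memory log stands in for Kafka. A real proxy would pass
# these calls to the Kafka producer and the ZK-backed consumer instead.
_cond = threading.Condition()
_log = defaultdict(list)      # topic -> list of messages
_position = defaultdict(int)  # (topic, group) -> index of next message to read


def produce(topic, message):
    """Append a message to the topic and wake any waiting consumers."""
    with _cond:
        _log[topic].append(message)
        _cond.notify_all()


def consume(topic, group, timeout=1.0):
    """Return the next unread message for topic+group, or None on timeout."""
    key = (topic, group)
    with _cond:
        ready = _cond.wait_for(
            lambda: _position[key] < len(_log[topic]), timeout=timeout
        )
        if not ready:
            return None
        message = _log[topic][_position[key]]
        _position[key] += 1
        return message


class ProxyHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # POST /<topic>: the request body is the message.
        parts = [p for p in self.path.split("/") if p]
        if len(parts) != 1:
            return self._reply(404)
        length = int(self.headers.get("Content-Length", 0))
        produce(parts[0], self.rfile.read(length).decode("utf-8"))
        self._reply(204)

    def do_GET(self):
        # GET /<topic>/<group>: one message, or 204 if none arrives in time.
        parts = [p for p in self.path.split("/") if p]
        if len(parts) != 2:
            return self._reply(404)
        message = consume(parts[0], parts[1])
        if message is None:
            return self._reply(204)
        self._reply(200, message.encode("utf-8"))

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def serve(port=8888):
    """Run the proxy until interrupted."""
    ThreadingHTTPServer(("localhost", port), ProxyHandler).serve_forever()
```

With `serve()` running, the two curl commands above should return the posted message on the GET. A GET with nothing to read comes back as an empty 204 after about one second.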

A few questions:

* Would including offsets be useful here? Since it uses the ZK-backed consumer code, I would think not.
* I have chosen to create one thread per topic+group (mostly for simplicity's sake). Multiple REST servers could be run behind a load balancer to increase consumer parallelism. Maybe it would make sense for an individual REST server to create more than one thread per topic+group?
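
For readers weighing the threading question, the one-thread-per-topic+group model can be sketched roughly as follows. This is Python rather than the prototype's Scala, and `ConsumerRegistry`, `open_stream`, `poll` and the buffer size of 100 are all hypothetical. `open_stream(topic, group)` stands in for whatever returns the ZK-backed consumer's message stream.

```python
import queue
import threading


class ConsumerRegistry:
    """Lazily starts one background thread per (topic, group) pair."""

    def __init__(self, open_stream):
        # Assumption: open_stream(topic, group) returns an iterable of
        # messages; it stands in for the ZK-backed consumer stream.
        self._open_stream = open_stream
        self._lock = threading.Lock()
        self._buffers = {}  # (topic, group) -> queue.Queue

    @staticmethod
    def _pump(stream, buffer):
        # Runs on the dedicated thread: move messages into the buffer,
        # blocking when the buffer is full.
        for message in stream:
            buffer.put(message)

    def _buffer_for(self, topic, group):
        key = (topic, group)
        with self._lock:
            if key not in self._buffers:
                buffer = queue.Queue(maxsize=100)
                stream = self._open_stream(topic, group)
                threading.Thread(
                    target=self._pump, args=(stream, buffer), daemon=True
                ).start()
                self._buffers[key] = buffer
            return self._buffers[key]

    def poll(self, topic, group, timeout=1.0):
        """Return the next message, or None if none arrives within timeout."""
        try:
            return self._buffer_for(topic, group).get(timeout=timeout)
        except queue.Empty:
            return None

    def thread_count(self):
        """Number of consumer threads started so far."""
        with self._lock:
            return len(self._buffers)
```

In this shape a GET handler would only call `poll(topic, group)`. Running several threads per topic+group would mean starting more than one pump per key, each with its own stream feeding the same buffer.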

Cheers
-David

On Sep 10, 2012, at 9:49 AM, David Arthur wrote:

> Bump.
>
> Anyone have feedback on this approach?
>
> -David
>
> On Aug 24, 2012, at 12:37 PM, David Arthur wrote:
>
>> Here is an initial pass at a Kafka REST proxy (in Scala)
>>
>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>
>> The basic gist is:
>> * Jetty for webserver
>> * Messages are strings
>> * GET /topic/group to get a message (timeout after 1s)
>> * POST /topic, the request body is the message
>> * One consumer thread per topic+group
>>
>> Be wary, many things are hard-coded at this point (port numbers, etc.). Obviously, this will need to change. Also, I haven't the slightest idea how to set up or use sbt properly, so I just checked in the libs.
>>
>> Feedback is welcome in this thread or on GitHub. Be gentle please, this is my first go at Scala.
>>
>> -David
>>
>> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>>
>>> Jay I agree with you 100%.
>>>
>>> At Tagged we have implemented a proxy for various internal reasons
>>> (primarily to act as a high-performance relay from PHP to Kafka). It's
>>> implemented in Node.js (JavaScript).
>>>
>>> Currently it services UDP packets encoded in binary, but it could
>>> easily be modified to accept HTTP as well, since Node's support for
>>> HTTP is pretty simple.
>>>
>>> If others are interested in maintaining something like this, we could
>>> consider adding this to the public domain alongside the already
>>> existing Node.js client implementation.
>>>
>>>
>>>
>>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>>>
>>>> My personal preference would be to have only a single protocol in kafka
>>>> core. I have been down the multiple protocol route and my experience was
>>>> that it adds a lot of burden for each change that needs to be made and a
>>>> lot of complexity to abstract over the different protocols. From the point
>>>> of view of a user they are generally a bit agnostic as to how bytes are
>>>> sent back and forth provided it is reliable and easily implementable in any
>>>> language. Generally they care more about the quality of the client in their
>>>> language of choice.
>>>>
>>>> My belief is that the main benefit of REST is ease of implementing a
>>>> client. But currently the biggest barrier is really the use of zk and
>>>> fairly thick consumer design. So I think the current thinking is that we
>>>> should focus on thinning that out and removing the client-side zk
>>>> dependency. I actually don't think TCP is a huge burden if the protocol is
>>>> simple, and there are actually some advantages (for example the consumer
>>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>>> this is not always available from HTTP client libraries).
>>>>
>>>> Basically this is an area where I think it is best to pick one way and
>>>> make it really bulletproof rather than providing lots of options.
>>>> In some sense each option tends to increase the complexity of testing
>>>> (since now there are many combinations to try) and also of implementation