David Arthur 2012-08-03, 14:41
I'd like to tackle this project (assuming it hasn't been started yet).

I wrote up some initial thoughts here: https://gist.github.com/3248179

TL;DR: use the Range header for specifying offsets and simple URIs like /kafka/topics/[topic]/[partition]; use it as a simple transport of bytes and/or represent the messages as some media type (text, JSON, XML).

Feedback is most welcome (in the Gist or in this thread).

Cheers!
-David
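To make the proposal concrete, here is a minimal sketch of how a server might build the proposed URI and read an offset out of a Range header. The thread does not specify the range syntax, so the `offset=START-END` unit, the function names, and the error handling are all assumptions made for illustration, not the format used in the Gist.

```python
import re

# Hypothetical range unit: the thread only says "use Range header for
# specifying offsets" and does not define the syntax.
RANGE_RE = re.compile(r"^offset=(\d+)-(\d*)$")


def topic_path(topic: str, partition: int) -> str:
    """Build the /kafka/topics/[topic]/[partition] URI from the proposal."""
    return f"/kafka/topics/{topic}/{partition}"


def parse_offset_range(header: str):
    """Parse 'offset=START-' or 'offset=START-END' into (start, end_or_None)."""
    match = RANGE_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unsupported Range header: {header!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else None
    if end is not None and end < start:
        raise ValueError("range end precedes range start")
    return start, end
```

An open-ended range such as `offset=100-` would mean "everything from offset 100 onward", which mirrors how byte ranges behave in HTTP.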
Re: Kafka REST interface
Jonathan Creasy 2012-08-03, 20:13
I have an internal one working and was hoping to have it open sourced in the next week. The one at Box is based on the CodeIgniter framework; we have about 45 RESTful interfaces built on this framework, so I just put together another one for Kafka.

Here are my notes. These were pre-dev, so they may be a little different from what we ended up with.

https://cwiki.apache.org/confluence/display/KAFKA/Restful+API+Proposal

I will read yours later this afternoon; we should work together.

-Jonathan
Re: Kafka REST interface
David Arthur 2012-08-06, 12:39
I'd be happy to collaborate on this, though it's been a while since I've used PHP.

From what it looks like, what you have is a true proxy that runs outside of Kafka and translates some REST routes into Kafka client calls. This sounds more in line with what the project page describes. What I have proposed is more like a translation layer between some REST routes and FetchRequests. In this case the client is responsible for managing offsets. Using the consumer groups and ZooKeeper would be another nice way of consuming messages (which is probably more like what you have).

Do any maintainers have feedback on this?
Re: Kafka REST interface
Jonathan Creasy 2012-08-06, 18:19
That is correct: a Consume REST request would be handed off to a Consumer class which communicates with ZooKeeper and Kafka. The proxy keeps track of clients, and there are a few routes to manipulate the offsets if you wanted to force a reset, seek to the end, or re-retrieve some data from a topic.

If others agree, I would be happy to collaborate on integrating a REST interface into Kafka directly, more as you have proposed, rather than a proxy. The proxy was low-hanging fruit within Box because we already had a framework for cranking out a REST API quite easily and a set of Kafka+ZK classes. It was simply a matter of stitching the two together.

-Jonathan
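The offset-manipulation routes described above (force a reset, seek to the end, re-retrieve some data) could sit on top of bookkeeping like the following. This is only a sketch: the Box proxy's real classes and routes are not shown in the thread, so `OffsetTracker` and its method names are hypothetical, offsets are treated as opaque integers, and state is kept in memory rather than in ZooKeeper.

```python
class OffsetTracker:
    """In-memory stand-in for the per-client offsets a proxy might keep."""

    def __init__(self):
        # (client_id, topic) -> next offset to read
        self._offsets = {}

    def position(self, client_id, topic):
        """Return the next offset for this client, defaulting to 0."""
        return self._offsets.get((client_id, topic), 0)

    def advance(self, client_id, topic, count):
        """Move forward after `count` units have been delivered."""
        self._offsets[(client_id, topic)] = self.position(client_id, topic) + count

    def reset(self, client_id, topic):
        """Force a reset back to the beginning of the topic."""
        self._offsets[(client_id, topic)] = 0

    def seek_to_end(self, client_id, topic, log_end_offset):
        """Skip everything currently in the log."""
        self._offsets[(client_id, topic)] = log_end_offset

    def rewind(self, client_id, topic, count):
        """Step back so some data is retrieved again; never goes below 0."""
        self._offsets[(client_id, topic)] = max(0, self.position(client_id, topic) - count)
```

Each method would map naturally onto one REST route, with the client and topic taken from the request path.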
Re: Kafka REST interface
David Arthur 2012-08-10, 16:54
With regard to embedding an HTTP server in Kafka to handle REST requests, how hard would it be to modify/extend the existing SocketServer? It seems like lots of good work went into the networking code in Kafka, so it would make sense to try to leverage that. I could imagine KafkaServer (optionally) starting up an HttpSocketServer with HttpRequestHandlers (similar to the existing SocketServer/RequestHandlers). Is this feasible/sensible?

Assuming Kafka handles the socket layer, perhaps something like Apache HTTP Components (http://hc.apache.org) could be used for parsing the HTTP messages? I'd like to stay away from higher-level web frameworks for something simple like this.
Re: Kafka REST interface
Jay Kreps 2012-08-10, 19:50
My personal preference would be to have only a single protocol in Kafka core. I have been down the multiple-protocol route, and my experience was that it adds a lot of burden for each change that needs to be made and a lot of complexity to abstract over the different protocols. From the point of view of users, they are generally agnostic as to how bytes are sent back and forth, provided it is reliable and easily implementable in any language. Generally they care more about the quality of the client in their language of choice.

My belief is that the main benefit of REST is ease of implementing a client. But currently the biggest barrier is really the use of ZK and the fairly thick consumer design. So I think the current thinking is that we should focus on thinning that out and removing the client-side ZK dependency. I actually don't think TCP is a huge burden if the protocol is simple, and there are actually some advantages (for example, the consumer needs to consume from multiple servers, so select/poll/epoll is natural, but this is not always available from HTTP client libraries).

Basically this is an area where I think it is best to pick one way and make it really bulletproof rather than providing lots of options. In some sense each option tends to increase the complexity of testing (since now there are many combinations to try) and also of implementation (since a lot of things that were concrete now need to be abstracted away).

So from this perspective I would prefer a standalone proxy that could evolve independently rather than retrofitting the current socket server to handle other protocols. There will be some overhead for the extra hop, but then there is some overhead for HTTP itself.

This is just my personal opinion; it would be great to hear what others think.

-Jay
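The point about select/poll/epoll can be illustrated with a short sketch: a consumer holding one raw connection per broker can wait on all of them at once and read from whichever has data. The code below uses Python's standard `selectors` module, with local socket pairs standing in for broker connections; the function names and the two-broker setup are assumptions made for the example, not part of any Kafka client.

```python
import selectors
import socket


def read_ready(connections, timeout=1.0):
    """Return {name: bytes} for every connection readable within `timeout`."""
    selector = selectors.DefaultSelector()
    try:
        for name, conn in connections.items():
            selector.register(conn, selectors.EVENT_READ, data=name)
        received = {}
        # One select call waits on all connections at the same time.
        for key, _events in selector.select(timeout):
            received[key.data] = key.fileobj.recv(4096)
        return received
    finally:
        selector.close()


def demo():
    """Two socket pairs stand in for connections to two brokers."""
    broker_a, consumer_a = socket.socketpair()
    broker_b, consumer_b = socket.socketpair()
    try:
        # Only "broker a" has something to say; "broker b" stays silent.
        broker_a.sendall(b"message from broker a")
        return read_ready({"a": consumer_a, "b": consumer_b})
    finally:
        for sock in (broker_a, consumer_a, broker_b, consumer_b):
            sock.close()
```

With a typical blocking HTTP client, the same consumer would need one thread or one outstanding request per server to get equivalent behavior.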
Re: Kafka REST interface
David Arthur 2012-08-11, 03:15
You make a good point about TCP not being the problem; rather, the complex consumer logic and ZK dependency are the barriers to entry for new clients. I'm starting to like the idea of a standalone proxy that uses an existing client (one with batteries included) to simplify things for HTTP clients.

If this is going into Kafka, I think we should stick with Scala/Java for implementation. Are there any preferences for what HTTP server is used? Any particular aversion to Jetty?

-David
Re: Kafka REST interface
Taylor Gautier 2012-08-12, 14:39
Jay, I agree with you 100%.

At Tagged we have implemented a proxy for various internal reasons (primarily to act as a high-performance relay from PHP to Kafka). It's implemented in Node.js (JavaScript).

Currently it services UDP packets encoded in binary, but it could easily be modified to accept HTTP also, since Node support for HTTP is pretty simple.

If others are interested in maintaining something like this, we could consider adding this to the public domain alongside the already existing Node.js client implementation.
Re: Kafka REST interface
David Arthur 2012-08-24, 16:37
Here is an initial pass at a Kafka REST proxy (in Scala):

https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala

The basic gist is:
* Jetty for webserver
* Messages are strings
* GET /topic/group to get a message (timeout after 1s)
* POST /topic, the request body is the message
* One consumer thread per topic+group

Be wary, many things are hard-coded at this point (port numbers, etc.). Obviously, this will need to change. Also, I haven't the slightest idea how to set up/use sbt properly, so I just checked in the libs.

Feedback is welcome in this thread or on GitHub. Be gentle please, this is my first go at Scala.

-David
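The routing and timeout behavior listed above can be sketched without Kafka or Jetty at all. In the following in-memory stand-in, one queue per topic+group plays the role of the consumer thread's buffer. Everything here is an assumption made for illustration: the class name, the choice to create a group on its first GET (so earlier messages are not seen), and the 204 status on timeout are not taken from RESTServer.scala.

```python
import queue
import threading
from collections import defaultdict


class InMemoryProxy:
    """Toy model of GET /topic/group and POST /topic with string messages."""

    def __init__(self, poll_timeout=1.0):
        self._poll_timeout = poll_timeout  # the thread mentions a 1s timeout
        self._lock = threading.Lock()
        self._groups = defaultdict(dict)  # topic -> {group: Queue}

    def _queue_for(self, topic, group):
        # A group's queue is created the first time that group is seen.
        with self._lock:
            return self._groups[topic].setdefault(group, queue.Queue())

    def post(self, topic, body):
        """POST /topic: deliver the body to every known group of the topic."""
        with self._lock:
            targets = list(self._groups[topic].values())
        for target in targets:
            target.put(body)
        return 200

    def get(self, topic, group):
        """GET /topic/group: return (status, message) or give up on timeout."""
        try:
            message = self._queue_for(topic, group).get(timeout=self._poll_timeout)
            return 200, message
        except queue.Empty:
            return 204, None  # assumed status for "no message within timeout"
```

Fanning a message out to every group mirrors consumer-group semantics, where each group receives its own copy of the topic's messages.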
Re: Kafka REST interface
David Arthur 2012-09-10, 13:49
Bump.

Anyone have feedback on this approach?

-David
Re: Kafka REST interface
David Arthur 2012-11-20, 21:08
Another bump for this thread...

For those just joining, this prototype is a simple HTTP server that proxies the complex consumer code through two HTTP endpoints.

https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala

E.g.,

curl http://localhost:8888/my-topic -X POST -d 'Here is a message'

and

curl http://localhost:8888/my-topic/my-group -X GET

This is not an attempt to expose the FetchRequest/ProduceRequest protocol over HTTP.

Few questions:

* Would including offsets be useful here? Since it is utilizing the ZK-backed consumer code, I would think not
* I have chosen to create one thread per topic+group (mostly for simplicity's sake). Multiple REST servers could be run and load balanced across to increase the consumer parallelism. Maybe it would make sense for an individual REST server to create more than one thread per topic+group?

Cheers
-David

On Sep 10, 2012, at 9:49 AM, David Arthur wrote:

> Bump.
>
> Anyone have feedback on this approach?
>
> -David
>
> On Aug 24, 2012, at 12:37 PM, David Arthur wrote:
>
>> Here is an initial pass at a Kafka REST proxy (in Scala)
>>
>> https://github.com/mumrah/kafka/blob/rest/contrib/rest-proxy/src/main/scala/RESTServer.scala
>>
>> The basic gist is:
>> * Jetty for webserver
>> * Messages are strings
>> * GET /topic/group to get a message (timeout after 1s)
>> * POST /topic, the request body is the message
>> * One consumer thread per topic+group
>>
>> Be wary, many things are hard coded at this point (port numbers, etc). Obviously, this will need to change. Also, I haven't the slightest idea how to setup/use sbt properly, so I just checked in the libs.
>>
>> Feedback is welcome in this thread or on Github. Be gentle please, this is my first go at Scala
>>
>> -David
>>
>> On Aug 12, 2012, at 10:39 AM, Taylor Gautier wrote:
>>
>>> Jay I agree with you 100%.
>>>
>>> At Tagged we have implemented a proxy for various internal reasons (
>>> primarily to act as a high performance relay from PHP to Kafka). It's
>>> implemented in Node.js (JavaScript)
>>>
>>> Currently it services UDP packets encoded in binary but it could
>>> easily be modified to accept http also since Node support for http is
>>> pretty simple.
>>>
>>> If others are interested in maintaining something like this we could
>>> consider adding this to the public domain along side the already
>>> existing Node.js client implementation.
>>>
>>> On Aug 10, 2012, at 3:51 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>>>
>>>> My personal preference would be to have only a single protocol in kafka
>>>> core. I have been down the multiple protocol route and my experience was
>>>> that it adds a lot of burden for each change that needs to be made and a
>>>> lot of complexity to abstract over the different protocols. From the point
>>>> of view of a user they are generally a bit agnostic as to how bytes are
>>>> sent back and forth provided it is reliable and easily implementable in any
>>>> language. Generally they care more about the quality of the client in their
>>>> language of choice.
>>>>
>>>> My belief is that the main benefit of REST is ease of implementing a
>>>> client. But currently the biggest barrier is really the use of zk and
>>>> fairly thick consumer design. So I think the current thinking is that we
>>>> should focus on thinning that out and removing the client-side zk
>>>> dependency. I actually don't think TCP is a huge burden if the protocol is
>>>> simple, and there are actually some advantages (for example the consumer
>>>> needs to consume from multiple servers so select/poll/epoll is natural but
>>>> this is not always available from HTTP client libraries).
>>>>
>>>> Basically this is an area where I think it is best to pick one way and
>>>> really make it really bullet proof rather than providing lots of options.
>>>> In some sense each option tends to increase the complexity of testing
>>>> (since now there are many combinations to try) and also of implementation
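The endpoint semantics described in this message (POST /topic stores the request body as a message, GET /topic/group returns the next message for that group or gives up after about a second) can be modelled without Kafka or HTTP. What follows is only a rough Python sketch under stated assumptions, not the code in RESTServer.scala: the class `InMemoryProxy`, its method names, and the choice to replay a topic from its beginning for a newly seen group are all hypothetical.

```python
import queue
import threading
from collections import defaultdict


class InMemoryProxy:
    """Toy stand-in for the proxy semantics (hypothetical; no Kafka, no HTTP)."""

    def __init__(self, poll_timeout=1.0):
        # The thread mentions a 1s timeout on GET; it is configurable here.
        self.poll_timeout = poll_timeout
        self._lock = threading.Lock()
        self._log = defaultdict(list)   # topic -> every message posted so far
        self._groups = {}               # (topic, group) -> queue.Queue

    def post(self, topic, body):
        """Model of POST /topic: the request body is the message."""
        with self._lock:
            self._log[topic].append(body)
            # Fan the message out to every group already reading this topic.
            for (known_topic, _group), pending in self._groups.items():
                if known_topic == topic:
                    pending.put(body)

    def get(self, topic, group):
        """Model of GET /topic/group: next message, or None on timeout."""
        key = (topic, group)
        with self._lock:
            if key not in self._groups:
                # Assumption: a new group starts from the beginning of the topic.
                pending = queue.Queue()
                for message in self._log[topic]:
                    pending.put(message)
                self._groups[key] = pending
            pending = self._groups[key]
        # Block outside the lock so that post() is never held up by a waiting GET.
        try:
            return pending.get(timeout=self.poll_timeout)
        except queue.Empty:
            return None


if __name__ == "__main__":
    proxy = InMemoryProxy(poll_timeout=0.1)
    proxy.post("my-topic", "Here is a message")
    print(proxy.get("my-topic", "my-group"))
    print(proxy.get("my-topic", "my-group"))
```

Each topic+group pair gets its own queue here, which loosely mirrors the one-consumer-thread-per-topic+group choice above; a real proxy would delegate offset tracking to the ZK-backed consumer rather than to an in-process queue.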
Re: Kafka REST interface
David Arthur 2012-11-20, 22:06
BTW, here are some cURL calls from my test environment:

https://gist.github.com/e59b9c8ee4ae56dad44f
Re: Kafka REST interface
Taylor Gautier 2012-11-21, 15:54
It would make sense to use nio rather than threaded io.
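To illustrate the difference between one thread per connection and the select/poll/epoll style raised in this thread, here is a small sketch using Python's standard `selectors` module. It is not Java NIO and not part of the prototype; the function `poll_once` and the names "broker-a" and "broker-b" are made up for the example, and local socket pairs stand in for connections to several servers.

```python
import selectors
import socket


def poll_once(names, timeout=1.0):
    """Read from several sockets on one thread (hypothetical helper)."""
    sel = selectors.DefaultSelector()
    writers = []
    try:
        for name in names:
            # A connected local pair stands in for a connection to one server.
            reader, writer = socket.socketpair()
            reader.setblocking(False)
            sel.register(reader, selectors.EVENT_READ, data=name)
            writer.sendall(f"message from {name}".encode())
            writers.append(writer)

        received = {}
        # A single select() call reports every socket that has data waiting,
        # so no per-connection thread is needed.
        for key, _events in sel.select(timeout=timeout):
            received[key.data] = key.fileobj.recv(4096)
        return received
    finally:
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()
        for writer in writers:
            writer.close()
        sel.close()


if __name__ == "__main__":
    print(poll_once(["broker-a", "broker-b"]))
```

The trade-off is the usual one: a readiness loop scales to many connections on few threads, while the thread-per-topic+group design is simpler to write and reason about.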