kafka and ruby client, beginner questions
S Ahmed 2012-05-05, 21:44
I have some messages already sent to Kafka using the producer perf test tool, but when I fire up irb I don't see any messages:
>irb
require 'rubygems'
require 'kafka'
c = Kafka::Consumer.new(:topic => "test")
m1 = c.consume
[]
So it says it is empty.
If I create a loop like:
c.loop do |msg|
  puts "rec"
  puts msg
end
And then I run the producer perf test, I see the messages printed.
So it seems I am missing something. Why doesn't it return any messages when I simply connect and there are messages already in the system? Or does connecting to a specific topic push the offset to the end of the queue?
Now if I create a consumer, and then create a producer and send a message, then when I call consumer.consume I get a message. When I cat the /tmp/kafka-logs/test-0/000000...000.kafka file, I can see my message and it seems to be in string format like:
6??helloworld
Shouldn't it be encoded?
Re: kafka and ruby client, beginner questions
Sebastian Eichner 2012-05-07, 08:07
Hi!
On Sat, May 5, 2012 at 11:44 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
> c = Kafka::Consumer.new(:topic => "test")
> m1 = c.consume
> []
>
> So it says it is empty.

The Ruby consumer only consumes messages that arrive after it was started (in lib/kafka/consumer.rb you can see that the offset is automatically set to latest_offset on startup). You can set your own offset in the Consumer's constructor, e.g. Kafka::Consumer.new(:offset => 12345), if you want.

To receive messages via the normal consume method, you'd have to create them after the consumer was instantiated, like so:
c = Kafka::Consumer.new(:topic => "test")
# now run the producer
m1 = c.consume
# m1 now contains the newly created messages

Sebastian
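The start-offset behaviour described above can be modelled with a small, self-contained Ruby sketch. It does not use the kafka gem or talk to a broker: ToyLog and ToyConsumer are hypothetical stand-ins written only for this illustration, and the length-prefixed byte layout is a simplification assumed for the toy, not the real on-disk format. The one thing it is meant to show is that a consumer created without an :offset starts at the end of the log, while one created with :offset => 0 replays what is already there.

```ruby
# Toy model only: ToyLog and ToyConsumer are hypothetical names,
# not part of the kafka gem, and no broker is involved.
class ToyLog
  attr_reader :bytes

  def initialize
    @bytes = String.new # empty, mutable, binary string
  end

  # Append one message as a 4-byte big-endian length followed by the payload.
  def append(payload)
    @bytes << [payload.bytesize].pack("N") << payload.b
  end

  # Offsets in this toy are byte positions; the latest one is the end of the log.
  def latest_offset
    @bytes.bytesize
  end
end

class ToyConsumer
  attr_reader :offset

  # Mirrors the behaviour described above: default to the end of the log
  # unless the caller passes an explicit :offset.
  def initialize(log, options = {})
    @log = log
    @offset = options[:offset] || log.latest_offset
  end

  # Return every message stored at or after the current offset, then advance.
  def consume
    messages = []
    while @offset < @log.latest_offset
      size = @log.bytes.byteslice(@offset, 4).unpack("N").first
      messages << @log.bytes.byteslice(@offset + 4, size)
      @offset += 4 + size
    end
    messages
  end
end

log = ToyLog.new
log.append("hello")
log.append("world")

late  = ToyConsumer.new(log)                # starts at the latest offset
early = ToyConsumer.new(log, :offset => 0)  # starts at the beginning

p late.consume   # nothing yet: it started at the end of the log
p early.consume  # the two messages that were already stored
log.append("again")
p late.consume   # only the message appended after it was created
```

With the real gem the equivalent experiment would be to pass :offset => 0 together with :topic in the constructor, assuming offset 0 is still valid for the log on the broker.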
Re: kafka and ruby client, beginner questions
S Ahmed 2012-05-07, 13:47
Is it possible to query for a list of offsets?
From what I understand, each offset is not for a single message, but could be for a group of messages, right?
Re: kafka and ruby client, beginner questions
Sebastian Eichner 2012-05-07, 13:52
On Mon, May 7, 2012 at 3:47 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
> Is it possible to query for a list of offsets?

Your Ruby consumer has exactly one offset. You can query it via consumer.offset (maybe take a look at the gem's source code; it's easy to understand).

> From what I understand, each offset is not for a single message, but could
> be a group of messages right?

Not sure what you are referring to, but maybe you are mixing this up with the info on the Java consumer, which handles multiple streams in parallel (and then does have multiple offsets). The Ruby consumer is much simpler (like the Java SimpleConsumer) and just works single-threaded with one offset.
Sebastian
Re: kafka and ruby client, beginner questions
S Ahmed 2012-05-07, 13:57
Ok sorry I think I was confused.
Once I have the offset, say the first offset (would it be 0?), I then call consume to get the next message, and I keep calling consume until it reaches the end?
Re: kafka and ruby client, beginner questions
Felix GV 2012-05-08, 18:24
The way I understand it, if you batch your messages (by default, the setting is still 1, I think, so no batching) and you have compression enabled, then each valid offset does correspond to a whole batch of messages, which is what you were referring to, I think.
Keep in mind that I have not played around with compression yet, so take whatever I say with a grain of salt.
Also, if you have batching (>1) enabled, but no compression, I would assume that you can still get each individual message in a batch with its appropriate offset, rather than only being able to get an offset for the entire batch of messages, but I have not tested that.
Sorry for not being able to provide a definitive answer. Hopefully, an expert can chime in to confirm or refute what I said...!
-- Felix