Kafka, mail # user - 0.8 producer -- many questions


Re: 0.8 producer -- many questions
Joel Koshy 2012-11-27, 00:32
> *Single socket multiplexing* -- the current protocol "standard header"
> doesn't include 'request_type_id'.  This implies that socket multiplexing
> is either unwelcome, or unexpected.  Is the expectation that I send
> Metadata requests via a separate socket?  Or instead that if I do send one,
> that the Metadata reply is prioritized over any other outstanding
> ProduceResponses?  (As an aside, I've assumed that ProduceResponse ordering
> from a single broker is undefined?  I use the correlation_id, but am
> curious if there are guarantees.)
>

The requestId for each request type is written out to the wire (see
BoundedByteBufferSend), so the broker can tell request types apart on a
single socket.
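For reference, a minimal sketch of how that header looks on the wire,
assuming the 0.8 layout (int32 size, int16 api_key, int16 api_version,
int32 correlation_id, length-prefixed client_id); the function name
pack_request_header is mine, not Kafka's:

```python
import struct

def pack_request_header(api_key, api_version, correlation_id, client_id):
    """Pack a Kafka 0.8 request header (sketch, not the official client).

    api_key identifies the request type (e.g. 0 = produce, 3 = metadata
    in 0.8), which is what lets different request types share one socket.
    correlation_id is echoed back in the response so replies can be
    matched even if ordering were not guaranteed.
    """
    cid = client_id.encode("utf-8")
    body = struct.pack(">hhih", api_key, api_version, correlation_id,
                       len(cid)) + cid
    # Leading int32 size covers everything after itself.
    return struct.pack(">i", len(body)) + body

# A metadata request header with correlation_id 42:
header = pack_request_header(3, 0, 42, "demo")
```

The request-specific payload would follow this header in the same buffer.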

> *ProduceRequest + MessageAndOffset* -- I haven't learned scala that far
> yet, and the many layers of abstraction make it difficult (for the naïve to
> trace).  In the ProduceRequest's message_set, it kinda appears that it
> reads in an array of "MessageAndOffset".  Am I misreading?  If not, what is
> the offset, and why?  If so, what are those bytes?
>
>
> For the record, this is what I recorded in my pythonic protocol dummy code:
>
> def pack_produce_message_set(pusher, partition_id, msg_set):
>     pusher.push_int(partition_id)                   # partition_id
>     pusher.push_int_marker()                        # msg_set size (backpatched)
>     for msg in msg_set:
>         pusher.push_long(0)                         # offset (always 0?)
>         pusher.push_int_marker()                    # msg size (backpatched)
>         pusher.push_int_marker()                    # CRC32 (backpatched)
>         pusher.push_byte(_message_magic_byte)       # Magic byte
>         pusher.push_byte(0)                         # attributes (compr, codec)
>
The offsets are largely irrelevant in a produce request - i.e., the
entire data buffer is just dumped into the request. If you're writing a
native producer, you don't need to compute an offset for each individual
message at the time you actually write the data out to the wire; the
broker assigns real offsets on its side. The best way to follow this is
to trace through the writeTo method in ProducerRequest.
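To make that concrete, here is a sketch of packing one message-set entry
with the offset simply set to 0, assuming the 0.8 message layout (int64
offset, int32 size, then crc / magic / attributes / key / value, with
key and value int32-length-prefixed and -1 meaning null); pack_message
is my name for it:

```python
import struct
from binascii import crc32

def pack_message(value, key=None, magic=0, attributes=0):
    """Pack one Kafka 0.8 message-set entry (sketch).

    The leading int64 offset is ignored by the broker on produce, so 0
    is fine here. The CRC32 covers everything after the crc field
    itself (magic byte onward).
    """
    def _bytes(b):
        if b is None:
            return struct.pack(">i", -1)  # -1 length means null
        return struct.pack(">i", len(b)) + b

    payload = struct.pack(">bb", magic, attributes) + _bytes(key) + _bytes(value)
    crc = crc32(payload) & 0xffffffff
    msg = struct.pack(">I", crc) + payload
    # offset = 0 on produce; broker replaces it with the real offset.
    return struct.pack(">qi", 0, len(msg)) + msg

entry = pack_message(b"hello", key=b"k")
```

On fetch, by contrast, the same slot carries the real offset assigned by
the broker, which is why the field exists in the shared format at all.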

Joel