Kafka >> mail # dev >> Question about offsets


Question about offsets
While fiddling with my Python client on 0.8, I noticed that something has
changed with offsets.

It seems that instead of a byte offset in the log file, the offset is
now a logical one. I had a few questions about this:

1) How is the byte offset determined by the broker? Since messages are
not fixed width, does it use an index or do a simple binary search?
2) Regarding compressed MessageSets, how are the offsets incremented?
Suppose I have the following MessageSet:

MessageSet A
  - Message A1, normal message
  - Message A2, normal message
  - Message A3, compressed MessageSet B
  - MessageSet B
     - Message B1
     - Message B2

Assuming we start from 0, message A1 gets an offset of 0 and A2 gets 1.
From there I am unclear how the numbering continues. Would A3 get offset 2,
with B1 -> 3 and B2 -> 4? Or are the offsets inside a compressed MessageSet
not used?
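
To make the two readings in question 2 concrete, here is a minimal sketch that numbers the example MessageSet both ways. It only illustrates the question; it does not claim to show what the 0.8 broker actually does. The `Msg` class and both function names are hypothetical and not part of any Kafka client.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Msg:
    """Hypothetical message model: a non-empty `inner` marks a compressed wrapper."""
    name: str
    inner: List["Msg"] = field(default_factory=list)


def number_all(msgs: List[Msg], start: int = 0) -> List[Tuple[str, int]]:
    """Reading 1: the wrapper takes an offset, then its inner messages continue the count."""
    numbered: List[Tuple[str, int]] = []
    next_offset = start

    def visit(m: Msg) -> None:
        nonlocal next_offset
        numbered.append((m.name, next_offset))
        next_offset += 1
        for child in m.inner:  # recursion also covers nested compressed sets
            visit(child)

    for m in msgs:
        visit(m)
    return numbered


def number_top_level(msgs: List[Msg], start: int = 0) -> List[Tuple[str, int]]:
    """Reading 2: offsets inside a compressed MessageSet are not used at all."""
    return [(m.name, start + i) for i, m in enumerate(msgs)]


# The MessageSet A from the example above.
message_set_a = [
    Msg("A1"),
    Msg("A2"),
    Msg("A3", inner=[Msg("B1"), Msg("B2")]),
]
```

Calling `number_all(message_set_a)` and `number_top_level(message_set_a)` side by side shows where the two readings diverge: only the first assigns any offset to B1 and B2.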

Is it possible to request a message inside a compressed message set?

Also, what about nested compressed sets? What if Message B3 were itself a
compressed MessageSet (not that it makes sense, I am just curious what
would happen)?
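
As a concrete version of the "index or binary search" guess in question 1, the sketch below combines the two: a sparse index maps every Nth logical offset to a byte position, a binary search finds the nearest indexed entry at or below the target, and a short forward scan over the variable-width messages covers the rest. This is an assumed scheme for illustration only. The function names, the index layout, and the `every` parameter are hypothetical and not taken from the broker code.

```python
import bisect
from itertools import accumulate
from typing import List, Tuple


def build_sparse_index(sizes: List[int], every: int = 4) -> List[Tuple[int, int]]:
    """Record (logical_offset, byte_position) for every `every`-th message."""
    # Byte position of message i is the total size of all messages before it.
    positions = [0] + list(accumulate(sizes))[:-1]
    return [(off, positions[off]) for off in range(0, len(sizes), every)]


def locate(sizes: List[int], index: List[Tuple[int, int]], target: int) -> int:
    """Return the byte position at which logical offset `target` starts."""
    if not 0 <= target < len(sizes):
        raise IndexError(f"offset {target} is out of range")
    keys = [off for off, _ in index]
    # Binary search for the last indexed offset that is <= target.
    i = bisect.bisect_right(keys, target) - 1
    off, pos = index[i]
    # Linear scan forward, skipping one variable-width message at a time.
    while off < target:
        pos += sizes[off]
        off += 1
    return pos


# Hypothetical message sizes in bytes, in log order.
sizes = [10, 25, 7, 40, 12, 30, 5, 18, 22]
index = build_sparse_index(sizes, every=4)
```

With a sparse index the scan never covers more than `every - 1` messages, so the index stays small while lookups remain cheap.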

Thanks!
-David
 
Jay Kreps 2013-01-30, 05:01
David Arthur 2013-01-30, 13:54