Kafka >> mail # user >> random access


random access
I was thinking of replicating messages to a central location, and having a
very long expire date on the messages (like say 1 year).

My requirement would be to not just stream messages, but also to access
messages by key, similar to a "SELECT * FROM TABLE WHERE id=123".

From what I understand, there is currently no index file that maps messages
to their exact location in a file, correct?  That is, Kafka streams the
messages: it opens a .kafka file, starts from the beginning, and streams the
data to a consumer.  If your offset happens to be in the middle of the file,
it will scan the file: start at the beginning of a message, figure out the
length of the message, and then jump to the position of the next message,
repeating until it finds the correct message offset.  Is this correct?
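
To make the scan I am describing concrete, here is a minimal sketch in
Python. It assumes a simplified, hypothetical log layout (a 4-byte big-endian
length followed by the payload, with the offset being the byte position where
a message starts). This is not Kafka's actual on-disk format, and the function
names are made up for illustration.

```python
import io
import struct

def write_message(log, payload: bytes) -> int:
    """Append a length-prefixed message; return the byte offset where it starts."""
    start = log.seek(0, io.SEEK_END)
    log.write(struct.pack(">I", len(payload)))  # hypothetical 4-byte length prefix
    log.write(payload)
    return start

def scan_to_offset(log, target: int) -> bytes:
    """Hop message to message from the start of the log until one begins at target."""
    pos = 0
    end = log.seek(0, io.SEEK_END)
    while pos < end:
        log.seek(pos)
        (length,) = struct.unpack(">I", log.read(4))
        if pos == target:
            return log.read(length)
        if pos > target:
            raise ValueError("offset is not a message boundary")
        pos += 4 + length  # skip the prefix and payload to reach the next message
    raise ValueError("offset is past the end of the log")

log = io.BytesIO()
first = write_message(log, b"first")
second = write_message(log, b"second")
print(scan_to_offset(log, second))
```

The cost of this scan grows with the number of messages that precede the
target, which is why a lookup by key would need something more than the log
itself.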

In other words, I would have to create some sort of index that maps the
'messageId' (which is stored in the body of the message itself) to the
offset of that message.
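
A rough sketch of the kind of index I have in mind, again in Python. The
assumptions here are mine: messages are JSON bodies carrying an "id" field,
the log uses a hypothetical 4-byte length prefix, and the index is a plain
in-memory dict built by one pass over the log. None of this is an existing
Kafka API.

```python
import io
import json
import struct

def append(log, record: dict) -> int:
    """Append a JSON record with a 4-byte length prefix; return its byte offset."""
    payload = json.dumps(record).encode("utf-8")
    start = log.seek(0, io.SEEK_END)
    log.write(struct.pack(">I", len(payload)) + payload)
    return start

def read_at(log, offset: int) -> dict:
    """Read the single record that starts at the given byte offset."""
    log.seek(offset)
    (length,) = struct.unpack(">I", log.read(4))
    return json.loads(log.read(length).decode("utf-8"))

def build_index(log) -> dict:
    """Scan the log once and map each record's "id" to its byte offset."""
    index = {}
    pos = 0
    end = log.seek(0, io.SEEK_END)
    while pos < end:
        log.seek(pos)
        (length,) = struct.unpack(">I", log.read(4))
        record = json.loads(log.read(length).decode("utf-8"))
        index[record["id"]] = pos  # a later record with the same id wins
        pos += 4 + length
    return index

log = io.BytesIO()
append(log, {"id": 123, "body": "hello"})
append(log, {"id": 456, "body": "world"})
index = build_index(log)
print(read_at(log, index[123]))  # the "WHERE id=123" style lookup
```

With a year of retained messages the dict would have to live on disk or in an
external store, but the shape of the lookup stays the same: key to offset,
then one seek into the log.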
Jay Kreps 2012-06-13, 14:01
S Ahmed 2012-06-13, 14:49
Jay Kreps 2012-06-13, 18:05