Good point about compressed message sets. I think that works and is simpler. We might still need the txn id to be able to apply the commit to the hashmap atomically, but this depends on a number of details that aren't really spec'd out yet, in particular how the replicas keep their hashmaps fed and how we fail over when mastership changes.
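To make the atomicity concern concrete, here is a minimal sketch of the kind of "apply the whole commit to the hashmap at once" behavior being discussed. All names and structure are hypothetical (this is not Kafka code): the point is just that a batch of offset entries is applied under one lock, so a reader never observes a half-applied commit.

```python
import threading

class OffsetStore:
    """Sketch: apply a whole offset-commit batch to the in-memory
    hashmap under one lock, so readers never see a partial commit."""

    def __init__(self):
        self._offsets = {}           # (group, topic, partition) -> offset
        self._lock = threading.Lock()

    def apply_batch(self, entries):
        # entries: iterable of ((group, topic, partition), offset).
        # Holding the lock across the whole loop is what makes the
        # batch atomic with respect to concurrent fetches.
        with self._lock:
            for key, offset in entries:
                self._offsets[key] = offset

    def fetch(self, key):
        with self._lock:
            return self._offsets.get(key)

store = OffsetStore()
store.apply_batch([(("g1", "t", 0), 42), (("g1", "t", 1), 7)])
print(store.fetch(("g1", "t", 0)))  # 42
```

A txn id (or an equivalent batch boundary, e.g. one compressed message set per commit) is what tells the replica which entries belong to the same `apply_batch` call.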
-Jay

On Tue, Dec 18, 2012 at 8:05 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
Okay, I did some assessment of the use cases we have that aren't using the default offset storage API and came up with one generalization. I would like to propose adding a generic metadata field to the offset API on a per-partition basis. That would leave us with the following:
If you want to store a reference to any associated state (say an HDFS file name) so that on consumer failover the new consumer can start up with the same state, this would be the place to store it. It would not be intended to support anything large (we could enforce a 1k limit or something): just something small, or a reference to where to find the state (say a file name).
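A rough sketch of what a per-partition commit with an opaque metadata string could look like. The function name, the store shape, and the exact 1k cap are all illustrative, taken from the "1k limit or something" suggestion above, not from any actual API:

```python
MAX_METADATA_BYTES = 1024  # illustrative cap ("we could enforce a 1k limit or something")

def commit_offset(offsets_store, group, topic, partition, offset, metadata=""):
    """Hypothetical per-partition commit carrying a small opaque
    metadata string, e.g. an HDFS file name for associated state."""
    if len(metadata.encode("utf-8")) > MAX_METADATA_BYTES:
        raise ValueError("metadata too large")
    offsets_store[(group, topic, partition)] = (offset, metadata)

store = {}
commit_offset(store, "etl-group", "clicks", 0, 1234,
              metadata="hdfs://nn/data/clicks/part-0001")
```

On fetch, the consumer would get the metadata string back alongside the offset and use it to reopen its state before resuming.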
-Jay

On Mon, Dec 17, 2012 at 10:45 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
I actually recommend we punt on implementing persistence in zk entirely; otherwise we have to have an upgrade path to grandfather existing zk data over to the new format. Let's just add it to the API and only actually store it once we redo the backend. We can handle the size limit then too.
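The punt above amounts to: accept metadata at the API layer but have the zk-backed store drop it, so no ZooKeeper data migration is needed until the backend is redone. A sketch, with entirely hypothetical names (not Kafka's actual code):

```python
class ZkOffsetBackend:
    """Sketch of the proposed punt: the API carries metadata, but this
    backend ignores it, so existing zk data needs no migration."""

    def __init__(self):
        self._zk = {}  # stand-in for the existing zk offset paths

    def commit(self, key, offset, metadata=""):
        # metadata intentionally dropped for now; it gets persisted
        # (and size-limited) only when the backend is reworked
        self._zk[key] = offset

    def fetch(self, key):
        offset = self._zk.get(key)
        return (offset, "")  # metadata always comes back empty here

backend = ZkOffsetBackend()
backend.commit(("g1", "t", 0), 42, metadata="hdfs://nn/state/0001")
print(backend.fetch(("g1", "t", 0)))  # (42, '')
```

Clients written against the new API would then work unchanged once a backend that actually stores the metadata lands.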
-Jay

On Thu, Dec 20, 2012 at 1:30 PM, David Arthur <[EMAIL PROTECTED]> wrote: