Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # user >> Transactional writing


Copy link to this message
-
Re: Transactional writing
If you use Kafka just as a redo log, you can't undo anything that's written
to the log. Write-ahead logs in typical database systems are both redo and
undo logs. Transaction commits and rollbacks are implemented on top of the
logs. However, general-purpose write-ahead logs for transactions are much
more complicated.

Thanks,

Jun

On Fri, Oct 26, 2012 at 11:08 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> This is an important feature and I am interested in helping out in the
> design and implementation, though I am working on 0.8 features for the next
> month so I may not be of too much use. I have thought a little bit about
> this, but I am not yet sure of the best approach.
>
> Here is a specific use case I think is important to address: consider a
> case where you are doing processing of one or more streams and producing an
> output stream. This processing may involve some kind of local state (say
> counters or other local aggregation intermediate state). This is a common
> scenario. The problem is to give reasonable semantics to this computation
> in the presence of failures. The processor effectively has a
> position/offset in each of its input streams as well as whatever local
> state. The problem is that if this process fails it needs to restore to a
> state that matches the last produced messages. There are several solutions
> to this problem. One is to make the output somehow idempotent, this will
> solve some cases but is not a general solution as many things cannot be
> made idempotent easily.
>
> I think the two proposals you give outline a couple of basic approaches:
> 1. Store the messages on the server somewhere but don't add them to the log
> until the commit call
> 2. Store the messages in the log but don't make them available to the
> consumer until the commit call
> Another option you didn't mention:
>
> I can give several subtleties to these approaches.
>
> One advantage of the second approach is that messages are in the log and
> can be available for reading or not. This makes it possible to support a
> kind of "dirty read" that allows the consumer to specify whether they want
> to immediately see all messages with low latency but potentially see
> uncommitted messages or only see committed messages.
>
> The problem with the second approach at least in the way you describe it is
> that you have to lock the log until the commit occurs otherwise you can't
> roll back (because otherwise someone else may have appended their own
> messages and you can't truncate the log). This would have all the problems
> of remote locks. I think this might be a deal-breaker.
>
> Another variation on the second approach would be the following: have each
> producer maintain an id and generation number. Keep a schedule of valid
> offset/id/generation numbers on the broker and only hand these out. This
> solution would support non-blocking multi-writer appends but requires more
> participation from the producer (i.e. getting a generation number and id).
>
> Cheers,
>
> -Jay
>
> On Thu, Oct 25, 2012 at 7:04 PM, Tom Brown <[EMAIL PROTECTED]> wrote:
>
> > I have come up with two different possibilities, both with different
> > trade-offs.
> >
> > The first would be to support "true" transactions by writing
> > transactional data into a temporary file and then copy it directly to
> > the end of the partition when the commit command is created. The
> > upside to this approach is that individual transactions can be larger
> > than a single batch, and more than one producer could conduct
> > transactions at once. The downside is the extra IO involved in writing
> > it and reading it from disk an extra time.
> >
> > The second would be to allow any number of messages to be appended to
> > a topic, but not move the "end of topic" offset until the commit was
> > received. If a rollback was received, or the producer timed out, the
> > partition could be truncated at the most recently recognized "end of
> > topic" offset. The upside is that there is very little extra IO (only
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB