Re: event ordering semantics
If everything is working well, Flume will give you approximately
in-order delivery. If transactions are being rolled back, that order
will be considerably mixed.

I don't see us guaranteeing delivery order anytime soon. Additionally,
for most Flume use cases a global serial number could be assigned at
the source and then a MapReduce job (or HBase) used to reorder the
events inside Hadoop.
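
As a rough illustration of the serial number idea, here is a minimal
Python sketch. It does not use any Flume API: the function names
(stamp_events, reorder), the dict-based event layout, and the
"source-1" identifier are all hypothetical, and the shuffle only
simulates the mixing that rolled-back transactions can cause.

```python
import itertools
import random


def stamp_events(bodies, source_id="source-1"):
    """Attach a serial number to each event as it enters the pipeline.

    Hypothetical helper: in a real deployment this would happen at the
    source, before the event is handed to a channel.
    """
    counter = itertools.count()
    return [
        {"source": source_id, "seq": next(counter), "body": body}
        for body in bodies
    ]


def reorder(events):
    """Restore the original order downstream, e.g. in a batch job."""
    return sorted(events, key=lambda e: (e["source"], e["seq"]))


stamped = stamp_events(["a", "b", "c", "d"])

# Simulate out-of-order delivery (assumption: any permutation is possible).
delivered = list(stamped)
random.Random(0).shuffle(delivered)

restored = reorder(delivered)
```

Sorting on (source, seq) only restores order per source. With several
sources, a truly global order would need a shared counter or some other
agreed tie-breaker, which this sketch does not attempt.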


On Tue, Nov 6, 2012 at 6:10 AM, Jan Van Besien <[EMAIL PROTECTED]> wrote:
> Hi,
> I am trying to figure out what are the exact semantics of flume with respect
> to event ordering. I'm interested in both single agent and multi agent
> setups.
> The current documentation (http://flume.apache.org/FlumeUserGuide.html)
> doesn't seem to mention anything about this topic.
> Older documentation and posts to this mailing list seem to suggest that
> flume doesn't guarantee in-order delivery of events, but also seem to
> suggest that "it depends".
> For example:
> - https://github.com/cloudera/flume/wiki/FAQ (last topic)
> -
> http://mail-archives.apache.org/mod_mbox/flume-user/201210.mbox/%[EMAIL PROTECTED]%3e
> Can somebody clarify this for me? If flume doesn't generally guarantee in
> order delivery, is it still achievable with a more restricted set of
> features? Or is this simply not a good idea, and should I look elsewhere
> with these kinds of requirements (or relax my requirements)?
> Thanks in advance,
> Jan
