
Jay Kreps 2013-08-23, 15:40
Jonathan Hodges 2013-08-27, 13:51
Re: Samza -- A YARN stream processing framework for Kafka
I can't answer the rest, but the catchy name comes from Gregor Samsa, a
character in Kafka's novella The Metamorphosis.

On Tue, Aug 27, 2013 at 6:51 AM, Jonathan Hodges <[EMAIL PROTECTED]> wrote:

> First off, I want to say this is awesome!  It has been great to see all the
> great YARN offerings being released lately.  I noticed Hadoop 2.x was
> recently voted beta so very exciting!
> Like many, we use Storm for near real-time processing of our Kafka-based
> streams.  In addition, we send this data to Hadoop for offline analysis.
> Consolidating these three environments into one is a win by itself.  I also
> really like the fault tolerance and security features.  Are you guys using
> Samza in production yet at LinkedIn, or is it still in development?
> The local state approach is very interesting.  Are you guys using Databus
> for the feed of changes from the external stores?  Is something like
> Voldemort integrated locally for the key/value store?  Can you maintain
> multiple tables locally for stream processing?
> Since we are using Storm, do any latency comparisons exist?  Since Samza
> makes the fault tolerance/durability tradeoff to persist to disk on every
> hop between StreamTasks, it would seem to take a hit here.  That said we
> use Trident a good bit, so many of our topologies are already slowed by
> remote calls to Cassandra.
> I know it is fairly new, but were any comparisons against Spark Streaming
> considered?  They take a similar tack of maintaining state locally as
> opposed to external stores, but I believe they are limited to what can fit
> in memory.
> Finally, where did the catchy name, Samza, come from?
> Thanks!
> Jonathan
> On Fri, Aug 23, 2013 at 9:39 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > Hey guys,
> >
> > This may be relevant to people on this list. A few of us at LinkedIn have
> > been working on Samza, a stream processing framework built on YARN. We
> > just added this as an Apache Incubator project. We would love to get
> > people's feedback (and help!). Here are the docs:
> >
> > http://samza.incubator.apache.org
> >
> > If anyone has any questions I'm happy to discuss what we are up to. Our
> > mailing list is here:
> >
> > http://samza.incubator.apache.org/community/mailing-lists.html
> >
> > -Jay
> >