Kafka, mail # user - Samza -- A YARN stream processing framework for Kafka


Re: Samza -- A YARN stream processing framework for Kafka
Xavier Stevens 2013-08-27, 16:05
I can't answer the rest, but the catchy name comes from Gregor Samsa, a
character in Kafka's novella The Metamorphosis.

https://en.wikipedia.org/wiki/Gregor_Samsa#Gregor_Samsa
-Xavier
On Tue, Aug 27, 2013 at 6:51 AM, Jonathan Hodges <[EMAIL PROTECTED]> wrote:

> First off, I want to say this is awesome!  It has been great to see all the
> YARN offerings being released lately.  I noticed Hadoop 2.x was recently
> voted beta, which is very exciting!
>
> Like many, we use Storm for near-real-time processing of our Kafka-based
> streams.  In addition, we send this data to Hadoop for offline analysis.
> Consolidating these three environments into one is a win by itself.  I also
> really like the fault tolerance and security features.  Are you guys using
> Samza in production yet at LinkedIn, or is it still in development?
>
> The local state approach is very interesting.  Are you guys using Databus
> for the feed of changes from the external stores?  Is something like
> Voldemort integrated locally for the key/value store?  Can you maintain
> multiple tables locally for stream processing?
>
> Since we are using Storm, do any latency comparisons exist?  Since Samza
> makes the fault tolerance/durability tradeoff to persist to disk on every
> hop between StreamTasks, it would seem to take a hit here.  That said, we
> use Trident a good bit, so many of our topologies are already slowed by
> remote calls to Cassandra.
>
> I know it is fairly new, but were any comparisons against Spark Streaming
> considered?  They take a similar tack of maintaining state locally as
> opposed to external stores, but I believe they are limited to what can fit
> in memory.
>
> Finally, where did the catchy name, Samza, come from?
>
> Thanks!
> Jonathan
>
>
>
> On Fri, Aug 23, 2013 at 9:39 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > Hey guys,
> >
> > This may be relevant to people on this list. A few of us at LinkedIn have
> > been working on Samza, a stream processing framework built on YARN. We
> > just added this as an Apache Incubator project. We would love to get
> > people's feedback (and help!). Here are the docs:
> >
> > http://samza.incubator.apache.org
> >
> > If anyone has any questions I'm happy to discuss what we are up to. Our
> > mailing list is here:
> >
> > http://samza.incubator.apache.org/community/mailing-lists.html
> >
> > -Jay
> >
>