Kafka >> mail # user >> filter before flush to disk


S Ahmed 2012-05-15, 13:38
Jay Kreps 2012-05-15, 15:24
Re: filter before flush to disk
What do you mean?

"I think the direction we are going
is instead to just let you co-locate this processing on the same box.
This gives the isolation of separate processes and the overhead of the
transfer over localhost is pretty minor."
I see what you're saying: it is a specific implementation/use case that
diverges from a general-purpose mechanism. That's why I was suggesting maybe
a hook/event-based system.

On Tue, May 15, 2012 at 11:24 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Yeah I see where you are going with that. We toyed with this idea, but
> the idea of coupling processing to the log storage raises a lot of
> problems for general purpose usage. I think the direction we are going
> is instead to just let you co-locate this processing on the same box.
> This gives the isolation of separate processes and the overhead of the
> transfer over localhost is pretty minor.
>
> -Jay
>
> On Tue, May 15, 2012 at 6:38 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
> > Would it be possible to filter the collection before it gets flushed to
> > disk?
> >
> > Say I am tracking page views per user, and I could perform a rollup before
> > it gets flushed to disk (using a hashmap with the key being the sessionId,
> > and incrementing a counter for the duplicate entries).
> >
> > And could this be done w/o modifying the original source, maybe through
> > some sort of event/listener?
>
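
For illustration, the rollup described in the original question (a hashmap keyed by sessionId with a counter per key) could be sketched roughly as below. This is only a sketch of the aggregation step as it might run in a separate, co-located process, as suggested in the reply. It does not use any Kafka API, and the names `PageViewRollup`, `roll_up`, and `max_sessions` are hypothetical. How messages are read from and written back to the log is left out.

```python
from collections import Counter
from typing import Dict, Iterable, List


class PageViewRollup:
    """Counts page views per session and hands the totals back in batches."""

    def __init__(self, max_sessions: int = 10_000) -> None:
        # max_sessions is a hypothetical cap so the in-memory map
        # cannot grow without bound between drains.
        self.max_sessions = max_sessions
        self._counts: Counter = Counter()

    def record(self, session_id: str) -> bool:
        """Add one page view; return True when the buffer should be drained."""
        self._counts[session_id] += 1
        return len(self._counts) >= self.max_sessions

    def drain(self) -> Dict[str, int]:
        """Return the rolled-up counts and reset the buffer."""
        rolled_up = dict(self._counts)
        self._counts.clear()
        return rolled_up


def roll_up(session_ids: Iterable[str], max_sessions: int = 10_000) -> List[Dict[str, int]]:
    """Roll a stream of session ids up into one or more batches of counts."""
    rollup = PageViewRollup(max_sessions)
    batches: List[Dict[str, int]] = []
    for session_id in session_ids:
        if rollup.record(session_id):
            batches.append(rollup.drain())
    # Emit whatever is left once the input is exhausted.
    remaining = rollup.drain()
    if remaining:
        batches.append(remaining)
    return batches
```

Each drained batch holds one entry per session instead of one entry per page view, so that is what a downstream step would persist. A real deployment would probably also want to drain on a timer, not only on size.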
S Ahmed 2012-05-15, 15:43
S Ahmed 2012-05-17, 13:40
Jay Kreps 2012-05-17, 15:02
S Ahmed 2012-05-17, 21:32
Jay Kreps 2012-05-17, 22:34
S Ahmed 2012-05-29, 13:30