Kafka >> mail # user >> filter before flush to disk


filter before flush to disk
Would it be possible to filter the collection before it gets flushed to disk?

Say I am tracking page views per user; I could perform a rollup before
the data gets flushed to disk (using a hashmap keyed by sessionId and
incrementing a counter for duplicate entries).
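
To make the rollup idea concrete, here is a minimal sketch in plain Java. It is not a Kafka API and does not hook into the broker's flush path; the class name PageViewRollup and the method rollup are hypothetical, and the sketch assumes each event in a batch has already been reduced to its sessionId string.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka: collapses a batch of page-view
// events into one count per session before the batch is written anywhere.
class PageViewRollup {

    // Returns sessionId -> number of page views seen in the batch.
    static Map<String, Long> rollup(List<String> sessionIds) {
        // LinkedHashMap keeps sessions in first-seen order
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String sessionId : sessionIds) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count
            counts.merge(sessionId, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample batch: five raw events across three sessions
        List<String> batch = Arrays.asList("s1", "s2", "s1", "s3", "s1");
        System.out.println(rollup(batch));
    }
}
```

Five raw events become three entries here, so whatever consumes the result writes one record per session instead of one per page view.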

And could this be done without modifying the original source, maybe through
some sort of event/listener?