Kafka, mail # user - filter before flush to disk

filter before flush to disk
S Ahmed 2012-05-15, 13:38
Would it be possible to filter the collection before it gets flushed to disk?

Say I am tracking page views per user. I could perform a rollup before
the data gets flushed to disk, using a hashmap keyed by sessionId and
incrementing a counter for each duplicate entry.
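
To make the idea concrete, here is a minimal sketch of the kind of rollup described above. It is not Kafka code and uses no Kafka API: the `rollup` function, the `batch` list, and the event fields are all hypothetical, and it assumes each event is a simple record carrying a `sessionId`.

```python
from collections import Counter

def rollup(events):
    """Collapse a batch of page-view events into one count per sessionId.

    `events` is assumed to be an iterable of dicts with a "sessionId" key
    (a hypothetical shape chosen for illustration only).
    """
    counts = Counter()
    for event in events:
        # Duplicate sessionIds just bump the counter instead of
        # producing another entry to be written.
        counts[event["sessionId"]] += 1
    return dict(counts)

# Hypothetical in-memory batch waiting to be flushed.
batch = [
    {"sessionId": "a1", "page": "/home"},
    {"sessionId": "b2", "page": "/home"},
    {"sessionId": "a1", "page": "/cart"},
]

print(rollup(batch))  # {'a1': 2, 'b2': 1}
```

The three buffered events would be reduced to two entries, so only the per-session totals would need to be written out.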

And could this be done without modifying the original source, maybe through
some sort of event/listener?