Kafka >> mail # user >> Event processing use case/examples


Event processing use case/examples
I am struggling with some core design concepts, and I was hoping someone
could explain how they use Kafka for event processing in their production
environment. For example, I've read that LinkedIn collects and aggregates
more than 60 metrics, e.g. page views, clicks, etc. I clearly grasp
the concept of logging a page view event to Kafka, but I'm missing the
last part: how does one go about aggregating this data and using it in any
way other than as a simple data sink?
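
To make the aggregation question concrete, here is a minimal sketch of what
"aggregating" page views on the consumer side might look like. It is plain
Python with no Kafka client involved; the event shape (a dict with "type" and
"url" keys) and the function name aggregate_page_views are assumptions made
up for illustration, not anything taken from Kafka or LinkedIn.

```python
from collections import Counter

def aggregate_page_views(events):
    """Count page views per URL from an iterable of event dicts.

    Hypothetical event shape: {"type": "page_view", "url": "/home"}.
    In a real consumer, `events` would be the decoded messages read
    from the topic rather than an in-memory list.
    """
    counts = Counter()
    for event in events:
        if event["type"] == "page_view":
            counts[event["url"]] += 1
    return counts

# Made-up sample events standing in for consumed messages.
sample_events = [
    {"type": "page_view", "url": "/home"},
    {"type": "click", "url": "/home"},
    {"type": "page_view", "url": "/jobs"},
    {"type": "page_view", "url": "/home"},
]
page_view_counts = aggregate_page_views(sample_events)
```

The resulting counts could then be written to a database or report instead
of just being stored as raw events.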

Taking the "page_view" example further: what is the preferred way of
logging and consuming this event? Would you have a consumer that just
consumes page views? If so, how do you make sure you don't
reconsume the same message in the event of a consumer restart? Also, for
analytical/reporting needs, how do you deal with timeframes? Say my
consumer is subscribed to the "page_view" topic and I want all messages
from 8am-9am. Would I read all messages and filter out any that don't
have a timestamp in that range, or would I create a separate topic for
each hour, e.g. "page_view/08:00"? The same question applies to importing
all "page_views" for yesterday into Hadoop.
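
The "read everything and filter" option, combined with one possible answer
to the restart question, can be sketched roughly as follows. This is only an
illustration under stated assumptions: the topic is simulated as an in-memory
list of (offset, event) pairs, the consumer is assumed to persist the offset
it should resume from, and the names LOG and consume are hypothetical. No
real Kafka API is used.

```python
from datetime import datetime

# Hypothetical stand-in for the "page_view" topic: (offset, event) pairs.
# The timestamps are made-up sample data.
LOG = [
    (0, {"url": "/home", "ts": datetime(2000, 1, 1, 7, 59)}),
    (1, {"url": "/jobs", "ts": datetime(2000, 1, 1, 8, 0)}),
    (2, {"url": "/home", "ts": datetime(2000, 1, 1, 8, 30)}),
    (3, {"url": "/jobs", "ts": datetime(2000, 1, 1, 9, 0)}),
]

def consume(log, committed_offset, start, end):
    """Read from committed_offset onward and keep events with start <= ts < end.

    Returns the matching events and the next offset to resume from.
    Saving that offset somewhere durable is what would let a restarted
    consumer skip messages it has already handled.
    """
    matched = []
    next_offset = committed_offset
    for offset, event in log:
        if offset < committed_offset:
            continue  # already processed before the restart
        if start <= event["ts"] < end:
            matched.append(event)
        next_offset = offset + 1
    return matched, next_offset

window_start = datetime(2000, 1, 1, 8, 0)
window_end = datetime(2000, 1, 1, 9, 0)

# First run: start at offset 0 and filter to the 8am-9am window.
first_batch, saved_offset = consume(LOG, 0, window_start, window_end)

# Simulated restart: resume from the saved offset, so nothing is re-read.
second_batch, resumed_offset = consume(LOG, saved_offset, window_start, window_end)
```

The alternative raised above, one topic per hour, would move the filtering
from the consumer into the topic naming scheme; this sketch does not cover it.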

I know Kafka is a new project and I'm sure everyone's time is constrained,
but I think it would be helpful if some high-level examples/use cases
and best practices were added to the wiki. This could help gain adoption
and hopefully bring in more willing contributors :)

Thanks for your help