Millions of messages per day (with each message being a few bytes) is not
really 'Big Data'. Kafka has been tested at a million messages per second.
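
To put "millions per day" in perspective, here is a rough back-of-the-envelope
calculation. The figures (5 million messages per day, 100 bytes each) are
assumptions picked only for illustration; plug in your own numbers.

```python
# Back-of-the-envelope average rate. The input figures below are assumed
# for illustration and do not come from the original question.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(messages_per_day, bytes_per_message):
    """Return (messages per second, bytes per second) averaged over a day."""
    msgs_per_sec = messages_per_day / SECONDS_PER_DAY
    return msgs_per_sec, msgs_per_sec * bytes_per_message


msgs, byts = average_rate(5_000_000, 100)
print(f"{msgs:.1f} msg/s, {byts / 1024:.1f} KiB/s")
```

Under those assumptions the average load is roughly 58 messages per second,
several orders of magnitude below a million per second. Real traffic is
bursty, so size for your peak rate rather than the daily average.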

The answer to all your questions, IMO, is "It depends".

You can start with a single instance (a single-machine installation) with
one broker, and let your producer send messages. Then increase to N
brokers. When you hit the upper limit of that machine, add a server and
repeat the process.

Benchmarking and scalability are things you should try for yourself by
playing with Kafka. Every use case is different, so a performance metric
from one setup is not a global answer.
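
If you want a starting point for your own measurements, a minimal timing
harness could look like the sketch below. It is generic Python, not a Kafka
API: `measure_throughput` and `fake_send` are hypothetical names, and
`fake_send` is an in-memory stand-in that you would replace with the send
call of whatever producer client you use.

```python
import time


def measure_throughput(send, message, count):
    """Call send(message) `count` times and return messages per second."""
    start = time.perf_counter()
    for _ in range(count):
        send(message)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Hypothetical stand-in for a real producer call. It only appends to a list,
# so the number it yields says nothing about Kafka itself.
sent = []


def fake_send(message):
    sent.append(message)


rate = measure_throughput(fake_send, b"x" * 100, 10_000)
print(f"{rate:,.0f} msg/s (in-memory stand-in, not a Kafka number)")
```

Run it with your real message sizes and against your real hardware, and
repeat as you add brokers. Those are the numbers that matter for your case.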

For your question on topic versus queue, please read about distributed
computing, pub/sub, message queues and related patterns. These are generic
concepts and have nothing to do with Kafka specifically. It again depends
on your use case.
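
As a rough illustration of the two generic patterns, here is a toy in-memory
sketch. The class names `WorkQueue` and `PubSubTopic` are made up for this
example and are not part of Kafka or any library; the only point is who
receives each message.

```python
from collections import deque


class WorkQueue:
    """Message-queue pattern: each message goes to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        # Whoever asks first gets the message; it is then gone for everyone.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Pub/sub pattern: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        for inbox in self._inboxes.values():
            inbox.append(message)

    def inbox(self, name):
        return list(self._inboxes[name])


queue = WorkQueue()
queue.publish("m1")
queue.publish("m2")
worker_a = queue.consume()  # takes "m1"
worker_b = queue.consume()  # takes "m2"; the work is split between workers

topic = PubSubTopic()
topic.subscribe("a")
topic.subscribe("b")
topic.publish("m1")  # both "a" and "b" now hold a copy of "m1"
```

If each message should be processed once by some worker, you want queue
semantics; if several independent consumers each need to see everything,
you want pub/sub.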

Please read up on what topics in Kafka are. If you just go through the
definition of topics, you will answer your question yourself within a

Replication and the rest would be next steps once you have a single
running instance of Kafka. So go ahead and get your hands dirty. You will
love Kafka :)

And yes, the most important thing: please read the documentation first (a
bit of theory) and then dive in. There is no silver bullet.


On Mon, Jul 22, 2013 at 4:27 PM, <[EMAIL PROTECTED]> wrote: