Kafka user mailing list: Apache Kafka in AWS


Jason Weiss 2013-05-22, 20:42
Neha Narkhede 2013-05-22, 20:57
Ken Krugler 2013-05-22, 21:24
Scott Clasen 2013-05-22, 23:27
Jonathan Hodges 2013-05-22, 23:11
Scott Clasen 2013-05-22, 23:56
Ken Krugler 2013-05-23, 01:00
Jun Rao 2013-05-23, 04:17
Re: Apache Kafka in AWS
Jason,

Unfortunately, Apache mailing lists don't support attachments. Could you
document your experience (with the graphs) in a blog (or a wiki page in
Kafka)?

Thanks,

Jun
On Thu, May 23, 2013 at 2:00 AM, Jason Weiss <[EMAIL PROTECTED]> wrote:

> Jun,
>
> Here is a screenshot from AWS's statistics (per-minute sampling is the
> finest granularity I believe that they chart). I don't have a screenshot of
> the top output.
>
> This shows that when I added a 4th machine to the cluster with the same
> number of clients, my CPU utilization fell but then held constant. The flat
> line is pretty obvious in the extended 4-minute test: it ramps up,
> flatlines, then ramps down.
>
> Jason
>
> ________________________________________
> From: Jun Rao [[EMAIL PROTECTED]]
> Sent: Thursday, May 23, 2013 00:17
> To: [EMAIL PROTECTED]
> Subject: Re: Apache Kafka in AWS
>
> Jason,
>
> Thanks for sharing. This is very interesting. Normally, Kafka brokers don't
> use much CPU. Is most of that 750% CPU actually used by the Kafka brokers?
>
> Jun
>
>
> On Wed, May 22, 2013 at 6:11 PM, Jason Weiss <[EMAIL PROTECTED]>
> wrote:
>
> > >>Did you check that you were using all cores?
> >
> > top was reporting over 750%
> >
> > Jason
> >
> > ________________________________________
> > From: Ken Krugler [[EMAIL PROTECTED]]
> > Sent: Wednesday, May 22, 2013 20:59
> > To: [EMAIL PROTECTED]
> > Subject: Re: Apache Kafka in AWS
> >
> > Hi Jason,
> >
> > On May 22, 2013, at 3:35pm, Jason Weiss wrote:
> >
> > > Ken,
> > >
> > > Great question! I should have indicated I was using EBS, 500GB with
> > > 2000 provisioned IOPS.
> >
> > OK, thanks. Sounds like you were pegged on CPU usage.
> >
> > But that does surprise me a bit. Did you check that you were using all
> > cores?
> >
> > Thanks,
> >
> > -- Ken
> >
> > PS - back in 2006 I spent a week of hell debugging an occasional job
> > failure on Hadoop (this was when it was still part of Nutch). Turns out
> > one of our 12 slaves was accidentally using OpenJDK, and this had a JIT
> > compiler bug that would occasionally rear its ugly head. Obviously the
> > Sun/Oracle JRE isn't bug-free, but it gets a lot more stress testing. So
> > one of my basic guidelines in the ops portion of the Hadoop class I teach
> > is that every server must have exactly the same version of Oracle's JRE.
> >
> > > ________________________________________
> > > From: Ken Krugler [[EMAIL PROTECTED]]
> > > Sent: Wednesday, May 22, 2013 17:23
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Apache Kafka in AWS
> > >
> > > Hi Jason,
> > >
> > > Thanks for the notes.
> > >
> > > I'm curious whether you went with using local drives (ephemeral
> > > storage) or EBS, and if with EBS then what IOPS.
> > >
> > > Thanks,
> > >
> > > -- Ken
> > >
> > > On May 22, 2013, at 1:42pm, Jason Weiss wrote:
> > >
> > >> All,
> > >>
> > >> I asked a number of questions of the group over the last week, and I'm
> > >> happy to report that I've had great success getting Kafka up and running
> > >> in AWS. I am using 3 EC2 instances, each of which is an M2 High-Memory
> > >> Quadruple Extra Large with 8 cores and 58.4 GiB of memory according to
> > >> the AWS specs. I have co-located ZooKeeper instances next to Kafka on
> > >> each machine.
> > >>
> > >> I am able to publish in a repeatable fashion 273,000 events per
> > >> second, with each event payload consisting of a fixed size of 2048
> > >> bytes! This represents the maximum throughput possible on this
> > >> configuration, as the servers became CPU constrained, averaging 97%
> > >> utilization in a relatively flat line. This isn't a "burst" speed – it
> > >> represents a sustained throughput from 20 M1 Large EC2 Kafka
> > >> multi-threaded producers. Putting this into perspective, if my log
> > >> retention period was a month, I'd be aggregating 1.3 petabytes of data
> > >> on my disk drives. Suffice to say, I don't see us retaining data for
> > >> more than a few hours!
> > >>
> > >> Here were the keys to tuning for future folks to consider:
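
A quick check of the figures quoted in this thread, not part of the original post: the retention estimate (273,000 events per second, 2048-byte payloads, one month) and the top reading (750% on 8 cores) can be reproduced with a few lines of Python. This is only a sketch. The 30-day month and the binary petabyte (2^50 bytes) are assumptions, since the post states neither, and the function names are made up for illustration. Only payload bytes are counted, so per-message overhead and any replication are ignored.

```python
# Figures taken from the thread.
EVENTS_PER_SECOND = 273_000
PAYLOAD_BYTES = 2048
CORES = 8
TOP_CPU_PERCENT = 750  # "over 750%" as reported by top

# Assumptions (not stated in the thread).
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month
BYTES_PER_PIB = 2 ** 50                # binary petabyte


def bytes_per_second(events_per_second: int, payload_bytes: int) -> int:
    """Raw payload throughput in bytes per second."""
    return events_per_second * payload_bytes


def retained_pib(events_per_second: int, payload_bytes: int, seconds: int) -> float:
    """Payload volume accumulated over a retention window, in PiB."""
    return bytes_per_second(events_per_second, payload_bytes) * seconds / BYTES_PER_PIB


def per_core_utilization(top_percent: float, cores: int) -> float:
    """Average per-core utilization (0..1) from top's summed %CPU."""
    return top_percent / (100 * cores)


if __name__ == "__main__":
    rate = bytes_per_second(EVENTS_PER_SECOND, PAYLOAD_BYTES)
    month = retained_pib(EVENTS_PER_SECOND, PAYLOAD_BYTES, SECONDS_PER_MONTH)
    util = per_core_utilization(TOP_CPU_PERCENT, CORES)
    print(f"throughput: {rate / 1e6:.0f} MB/s")
    print(f"one month of retention: {month:.2f} PiB")
    print(f"average per-core utilization: {util:.1%}")
```

Under these assumptions the monthly total rounds to the 1.3 petabytes quoted in the post, and 750% across 8 cores averages a little under 94% per core, which is in line with the reported 97% flat line.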

 
Jason Weiss 2013-05-23, 14:13
S Ahmed 2013-05-28, 19:48
S Ahmed 2013-05-29, 17:40