Re: Apache Kafka in AWS
Thanks. FWIW, this one has been fine so far:

java version "1.7.0_13"
OpenJDK Runtime Environment (IcedTea7 2.3.6) (Ubuntu build 1.7.0_13-b20)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

though I'm not running it at the load in your tests.
On Wed, May 22, 2013 at 4:51 PM, Jason Weiss <[EMAIL PROTECTED]> wrote:

> [ec2-user@ip-10-194-5-76 ~]$ java -version
> java version "1.6.0_24"
> OpenJDK Runtime Environment (IcedTea6 1.11.11) (amazon-61.1.11.11.53.amzn1-x86_64)
> OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
>
>
> Yes, as soon as I put it under heavy load, it would buckle almost
> every time. I knew it was JDK related because, after I temporarily gave
> up on AWS, I was able to run the same code on my MacBook Pro without
> issue. That's when I upgraded AWS to Oracle Java 7 64-bit, and all my
> crashes under load disappeared.
>
> Jason
>
>
> ________________________________________
> From: Scott Clasen [[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 19:27
> To: users
> Subject: Re: Apache Kafka in AWS
>
> Hey Jason,
>
> Question: what OpenJDK version did you have issues with? I'm running Kafka
> on it now and it has been OK. Was it a crash only under load?
>
> Thanks
> SC
>
>
> On Wed, May 22, 2013 at 1:42 PM, Jason Weiss <[EMAIL PROTECTED]>
> wrote:
>
> > All,
> >
> > I asked a number of questions of the group over the last week, and I'm
> > happy to report that I've had great success getting Kafka up and running
> > in AWS. I am using 3 EC2 instances, each of which is an M2 High-Memory
> > Quadruple Extra Large with 8 cores and 58.4 GiB of memory according to
> > the AWS specs. I have co-located Zookeeper instances next to Kafka on
> > each machine.
> >
> > I am able to publish, in a repeatable fashion, 273,000 events per second,
> > with each event payload a fixed 2048 bytes! This represents the maximum
> > throughput possible on this configuration, as the servers became CPU
> > constrained, averaging 97% utilization in a relatively flat line. This
> > isn't a "burst" speed; it represents sustained throughput from 20
> > multi-threaded Kafka producers running on M1 Large EC2 instances.
> > Putting this into perspective, if my log retention period were a month,
> > I'd be aggregating 1.3 petabytes of data on my disk drives. Suffice it
> > to say, I don't see us retaining data for more than a few hours!
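
For anyone who wants to check the retention figure above, here is a minimal sketch of the arithmetic. It assumes a 30-day month and counts payload bytes only (no per-message overhead, no compression); the function name is made up for illustration.

```python
# Back-of-the-envelope check of the retention estimate quoted above.
# Assumptions (not from the original message): a 30-day month, payload
# bytes only, no per-message overhead and no compression.

EVENTS_PER_SECOND = 273_000   # sustained rate reported in the thread
PAYLOAD_BYTES = 2048          # fixed event payload size reported in the thread
SECONDS_PER_DAY = 86_400


def retained_bytes(events_per_second, payload_bytes, retention_days):
    """Bytes accumulated if every event is kept for retention_days."""
    return events_per_second * payload_bytes * SECONDS_PER_DAY * retention_days


bytes_per_second = EVENTS_PER_SECOND * PAYLOAD_BYTES
month_bytes = retained_bytes(EVENTS_PER_SECOND, PAYLOAD_BYTES, 30)

print(f"ingest rate: {bytes_per_second / 1e6:.0f} MB/s")
print(f"30 days:     {month_bytes / 1e15:.2f} PB (decimal)")
print(f"30 days:     {month_bytes / 2**50:.2f} PiB (binary)")
```

The binary figure comes out just under 1.3, which is consistent with the "1.3 petabytes" quoted above; in decimal units the same volume is closer to 1.45 PB.
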
> >
> > Here are the keys to tuning, for future folks to consider:
> >
> > First and foremost, be sure to configure your Java heap size accordingly
> > when you launch Kafka. The default is something like 512 MB, which in my
> > case left virtually all of my RAM inaccessible to Kafka.
> > Second, stay away from OpenJDK. No, seriously: this was a huge thorn in
> > my side, and I almost gave up on Kafka because of the problems I
> > encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> > crashing and burning in dramatic fashion. The moment I switched over to
> > Oracle's JDK for Linux, Kafka didn't puke once. I mean, not even a
> > hiccup.
> > Third, know your message size. In my opinion, the more you understand
> > about your event payload characteristics, the better you can tune the
> > system. The two knobs to really turn are log.flush.interval and
> > log.default.flush.interval.ms. The values here are intrinsically
> > connected to the types of payloads you are putting through the system.
> > Fourth and finally, to maximize throughput you have to code against the
> > async paradigm, and be prepared to tweak the batch size, queue
> > properties, and compression codec (wait for it…) in a way that matches
> > the message payload you are putting through the system and the
> > capabilities of the producer system itself.
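
To illustrate why message size matters for the flush and batch knobs mentioned above, here is a rough sizing sketch. It assumes a message-count flush threshold that applies to a single log receiving a steady message rate. The threshold of 10,000 messages, the batch of 200 messages, and the even three-way split of the load are illustrative guesses, not values from the original message, and the function names are hypothetical.

```python
# Rough sizing sketch: how payload size turns message-count settings into
# bytes and time. All settings below are illustrative assumptions.


def flush_profile(msgs_per_second, payload_bytes, flush_interval_msgs):
    """Return (payload bytes between flushes, seconds between flushes)
    for a message-count flush threshold on one log."""
    bytes_per_flush = flush_interval_msgs * payload_bytes
    seconds_between_flushes = flush_interval_msgs / msgs_per_second
    return bytes_per_flush, seconds_between_flushes


def batch_bytes(batch_size_msgs, payload_bytes):
    """Payload bytes carried by one async producer batch."""
    return batch_size_msgs * payload_bytes


# Assumption: 273,000 msgs/s spread evenly over three logs (one per broker).
per_log_rate = 273_000 / 3

flush_bytes, flush_secs = flush_profile(per_log_rate, 2048, 10_000)
print(f"flush every {flush_secs:.3f} s, about {flush_bytes / 1e6:.1f} MB each")
print(f"one 200-message batch carries {batch_bytes(200, 2048) / 1024:.0f} KiB")
```

The same message-count settings mean very different byte volumes for 200-byte and 2048-byte payloads, which is why the values have to be chosen with the payload in mind.
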
> >
> >
> > Jason