Re: Apache Kafka in AWS
Thanks. FWIW, this one has been fine so far:

java version "1.7.0_13"
OpenJDK Runtime Environment (IcedTea7 2.3.6) (Ubuntu build 1.7.0_13-b20)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

though not running at the load in your tests.
On Wed, May 22, 2013 at 4:51 PM, Jason Weiss <[EMAIL PROTECTED]> wrote:

> [ec2-user@ip-10-194-5-76 ~]$ java -version
> java version "1.6.0_24"
> OpenJDK Runtime Environment (IcedTea6 1.11.11)
> (amazon-61.1.11.11.53.amzn1-x86_64)
> OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
>
>
> Yes, as soon as I put it under heavy load, it would buckle almost
> consistently. I knew it was JDK related because I temporarily gave up on
> AWS, but I was able to run the same code on my MacBook Pro without issue.
> That's when I upgraded AWS to Oracle Java 7 64-bit and all my crashes
> disappeared under load.
>
> Jason
>
>
> ________________________________________
> From: Scott Clasen [[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 19:27
> To: users
> Subject: Re: Apache Kafka in AWS
>
> Hey Jason,
>
> Question: what OpenJDK version did you have issues with? I'm running Kafka
> on it now and it has been OK. Was it a crash only under load?
>
> Thanks
> SC
>
>
> On Wed, May 22, 2013 at 1:42 PM, Jason Weiss <[EMAIL PROTECTED]> wrote:
>
> > All,
> >
> > I asked a number of questions of the group over the last week, and I'm
> > happy to report that I've had great success getting Kafka up and running
> > in AWS. I am using 3 EC2 instances, each of which is an M2 High-Memory
> > Quadruple Extra Large with 8 cores and 68.4 GiB of memory according to
> > the AWS specs. I have co-located ZooKeeper instances next to Kafka on
> > each machine.
> >
> > I am able to publish in a repeatable fashion 273,000 events per second,
> > with each event payload consisting of a fixed size of 2048 bytes! This
> > represents the maximum throughput possible on this configuration, as the
> > servers became CPU constrained, averaging 97% utilization in a relatively
> > flat line. This isn't a "burst" speed – it represents a sustained
> > throughput from 20 M1 Large EC2 Kafka multi-threaded producers. Putting
> > this into perspective, if my log retention period was a month, I'd be
> > aggregating 1.3 petabytes of data on my disk drives. Suffice to say, I
> > don't see us retaining data for more than a few hours!
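
The retention figure above can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes "a month" means 30 days and counts raw payload bytes only, ignoring per-message overhead, compression, and any other copies of the data.

```python
# Back-of-the-envelope check of the throughput and retention figures quoted above.
EVENTS_PER_SECOND = 273_000   # sustained publish rate reported in the message
PAYLOAD_BYTES = 2048          # fixed event payload size reported in the message
RETENTION_DAYS = 30           # assumption: "a month" taken as 30 days
SECONDS_PER_DAY = 86_400

bytes_per_second = EVENTS_PER_SECOND * PAYLOAD_BYTES
bytes_per_month = bytes_per_second * SECONDS_PER_DAY * RETENTION_DAYS

print(f"{bytes_per_second / 1e6:.0f} MB/s")           # 559 MB/s
print(f"{bytes_per_month / 1e15:.2f} PB (decimal)")   # 1.45 PB
print(f"{bytes_per_month / 2**50:.2f} PiB (binary)")  # 1.29 PiB
```

So the "1.3 petabytes" quoted above matches the binary (PiB) reading; in decimal units the same volume is closer to 1.45 PB.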
> >
> > Here were the keys to tuning for future folks to consider:
> >
> > First and foremost, be sure to configure your Java heap size accordingly
> > when you launch Kafka. The default is around 512MB, which in my case left
> > virtually all of my RAM inaccessible to Kafka.
> >
> > Second, stay away from OpenJDK. No, seriously: this was a huge thorn in
> > my side, and I almost gave up on Kafka because of the problems I
> > encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> > crashing and burning in dramatic fashion. The moment I switched over to
> > Oracle's JDK for Linux, Kafka didn't puke once. I mean, like not even a
> > hiccup.
> >
> > Third, know your message size. In my opinion, the more you understand
> > about your event payload characteristics, the better you can tune the
> > system. The two knobs to really turn are log.flush.interval and
> > log.default.flush.interval.ms. The values here are intrinsically
> > connected to the types of payloads you are putting through the system.
> >
> > Fourth and finally, to maximize throughput you have to code against the
> > async paradigm, and be prepared to tweak the batch size, queue
> > properties, and compression codec (wait for it…) in a way that matches
> > the message payload you are putting through the system and the
> > capabilities of the producer system itself.
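
As a rough illustration of the fourth point, the sketch below renders a set of async producer settings as `.properties` text and shows how batch size and payload size combine. The property names (`producer.type`, `batch.size`, `queue.time`, `queue.size`, `compression.codec`) are an assumption based on 0.7-era producer configuration and do not appear in the message above; the values are illustrative, not the ones Jason used. Check the names against the documentation for your Kafka release.

```python
def async_producer_properties(batch_size, queue_time_ms, queue_size, compression_codec):
    """Render async producer settings as .properties text.

    The property names are assumed (0.7-era style) and may differ in your release.
    """
    settings = {
        "producer.type": "async",                # buffer and send in the background
        "batch.size": batch_size,                # messages sent per batch
        "queue.time": queue_time_ms,             # max ms a message waits in the queue
        "queue.size": queue_size,                # max messages buffered in the queue
        "compression.codec": compression_codec,  # assumed: 0 means no compression
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


def batch_payload_bytes(batch_size, payload_bytes):
    """Uncompressed payload bytes carried by one full batch."""
    return batch_size * payload_bytes


# Illustrative values only; tune them against your own payloads and producers.
props = async_producer_properties(
    batch_size=200, queue_time_ms=5000, queue_size=10000, compression_codec=0
)
print(props)
print(batch_payload_bytes(200, 2048))  # 409600 bytes per batch of 2048-byte events
```

The point of `batch_payload_bytes` is simply that batch size only means something relative to payload size: 200 messages of 2048 bytes is about 400 KB per request, while the same batch of 100-byte messages is about 20 KB.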
> >
> >
> > Jason
 