Kafka, mail # user - since flushes are batched, is it still io intensive?


Re: since flushes are batched, is it still io intensive?
S Ahmed 2012-05-11, 16:16
Do you tune the OS memory dedicated to the page cache, or is that all
automatic?

It would be cool if LinkedIn posted some of their server-level tweaks, if
those are critical to getting the most out of zero copy and Kafka in general
:)

On Fri, May 11, 2012 at 12:10 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> The memory required for the JVM is also low (2-4GB heap size). Most of the
> memory is used for the page cache.
>
> Jun
>
> On Fri, May 11, 2012 at 8:03 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
>
> > What about memory?  I know you guys have 24GB of RAM per server?
> >
> > Basically I'm juggling between going with a dedicated box (which has
> > faster IO) and EC2, which has slower IO but is cheaper on the RAM side
> > (way cheaper!).
> >
> > On Fri, May 11, 2012 at 10:34 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> > > It all depends on the volume of the data. At LinkedIn, we observed that
> > > the IO load on a typical Kafka broker is not high.
> > >
> > > Jun
> > >
> > > On Fri, May 11, 2012 at 7:13 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
> > >
> > > > I was thinking (after doing some tests on dedicated hardware and EC2):
> > > > would you still say Kafka is IO intensive?
> > > >
> > > > Considering that writes are batched every x seconds, that you have a
> > > > single Kafka server on a given instance, and that consumers are just
> > > > streaming the data in sequential order (the disk head isn't jumping
> > > > around), is it safe to say Kafka is light enough on IO that running it
> > > > on EC2 should be just as good as dedicated hardware?
> > > >
> > > > I was getting pretty good results on EC2, so this thought came to me...
> > > >
> > >
> >
>
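
For anyone wanting to check the "most of the memory is used for pagecache" point on their own broker host, here is a minimal sketch, assuming a Linux box where /proc/meminfo is available. The helper names parse_meminfo and page_cache_fraction are made up for illustration and are not part of Kafka; the sketch counts the Cached and Buffers fields as page cache, which is only a rough approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_fraction(fields):
    """Approximate share of total RAM held as page cache (Cached + Buffers)."""
    total = fields["MemTotal"]
    cached = fields.get("Cached", 0) + fields.get("Buffers", 0)
    return cached / total


if __name__ == "__main__":
    # Assumes Linux; other systems expose this information differently.
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(f"page cache: {page_cache_fraction(info):.1%} of RAM")
```

Run on a host that has been serving traffic for a while, this gives a rough idea of how much RAM the kernel is currently using for cached file data, next to whatever the JVM heap takes.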