since flushes are batched, is it still io intensive?
S Ahmed 2012-05-11, 14:13
After running some tests on both dedicated hardware and EC2, I was wondering: would you still say Kafka is I/O intensive?
Considering that writes are batched every x seconds, that you have a single Kafka server on a given instance, and that consumers just stream the data in sequential order (the disk head isn't jumping around), is it safe to say Kafka isn't that I/O intensive, to the point that running it on EC2 should be just as good as dedicated hardware?
I was getting pretty good results on EC2, so this thought came to me...
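To make the "writes are batched" idea concrete, here is a minimal Python sketch of an append-only log that calls fsync only after a given number of messages or a given number of seconds. It is an illustration under stated assumptions, not Kafka's actual implementation: the class name `BatchedLogWriter`, its parameters, and the length-prefixed record layout are all hypothetical.

```python
import os
import time


class BatchedLogWriter:
    """Append-only log that fsyncs every N messages or every T seconds.

    Hypothetical illustration only; this is not Kafka's code.
    """

    def __init__(self, path, flush_messages=500, flush_seconds=1.0,
                 clock=time.monotonic):
        self._file = open(path, "ab")
        self._flush_messages = flush_messages
        self._flush_seconds = flush_seconds
        self._clock = clock
        self._unflushed = 0
        self._last_flush = clock()
        self.flush_count = 0  # number of fsync calls issued so far

    def append(self, payload: bytes) -> None:
        # Records are only ever appended, so the disk sees sequential writes.
        self._file.write(len(payload).to_bytes(4, "big") + payload)
        self._unflushed += 1
        due = (self._unflushed >= self._flush_messages
               or self._clock() - self._last_flush >= self._flush_seconds)
        if due:
            self.flush()

    def flush(self) -> None:
        if self._unflushed == 0:
            return
        self._file.flush()              # user-space buffer -> OS page cache
        os.fsync(self._file.fileno())   # OS page cache -> disk
        self._unflushed = 0
        self._last_flush = self._clock()
        self.flush_count += 1

    def close(self) -> None:
        self.flush()
        self._file.close()
```

With a batch size of 4, ten appends cost two fsync calls during the appends plus one on close, instead of ten. That is the sense in which batching reduces the number of disk operations per message.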
Re: since flushes are batched, is it still io intensive?
Jun Rao 2012-05-11, 14:34
It all depends on the volume of the data. At LinkedIn, we observed that the I/O load on a typical Kafka broker is not high.
Jun
Re: since flushes are batched, is it still io intensive?
S Ahmed 2012-05-11, 15:03
What about memory? I know you guys have 24GB of RAM per server?
Basically I'm juggling between going with a dedicated box (which has faster I/O) and EC2, which has slower I/O but is cheaper on the RAM side (way cheaper!).
Re: since flushes are batched, is it still io intensive?
Jun Rao 2012-05-11, 16:10
The memory required for the JVM is also low (2-4GB heap size). Most of the memory is used for the page cache.
Jun
Re: since flushes are batched, is it still io intensive?
S Ahmed 2012-05-11, 16:16
Do you tune how much memory the OS dedicates to the page cache, or is that all automatic?
It would be cool if LinkedIn posted some of their server-level tweaks, if those are critical to getting the most out of zero copy and Kafka in general :)
Re: since flushes are batched, is it still io intensive?
Jay Kreps 2012-05-11, 23:10
No, Linux does the right thing by default. We have an operations page on the site that gives all the details on our setup, but there is nothing special to set up.
-Jay
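For anyone who wants to see how much RAM Linux is currently giving to the page cache, here is a small Python sketch that parses `/proc/meminfo`. The `MemTotal` and `Cached` fields are standard on Linux; the function names are made up for illustration, and this is not something taken from the operations page.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts with no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_fraction(fields):
    """Fraction of total RAM currently used as page cache."""
    return fields["Cached"] / fields["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On a Linux broker, `page_cache_fraction(read_meminfo())` gives the current share of RAM used as page cache. On a box that has been serving reads and writes for a while, it would typically account for most of the memory not taken by the JVM heap, with no tuning involved.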
Re: since flushes are batched, is it still io intensive?
S Ahmed 2012-05-12, 01:21
Then why have an operations page? j/k
Thanks, Jay!
Just a note, and I hope nobody takes it the wrong way, but I was looking at the Flume project and I really appreciated how many comments they had in their code. Scala is already a bit cryptic; comments would go a long way for newbies :)