where does the in-memory log to file persistence take place
S Ahmed 2012-05-07, 19:12
I can barely read Scala, but I'm curious where the application takes the in-memory log and persists it to the database, all the while accepting new log messages and removing the keys of the messages that have been persisted to disk.
I'm guessing you have used a ConcurrentHashMap where the key is a topic, and once the flush timeout has been reached, a background thread will somehow persist and remove the keys.
Re: where does the in-memory log to file persistence take place
Neha Narkhede 2012-05-07, 19:14
Ahmed,
The related code is in kafka.log.*. The message to file persistence is inside FileMessageSet.scala.
Thanks, Neha
Re: where does the in-memory log to file persistence take place
S Ahmed 2012-05-08, 20:44
I'm slowly trying to understand it; I have to ramp up on my Scala.
When the flush/sync occurs, does it pull items off the collection one by one, or does it do this in bulk somehow while locking the collection?
Re: where does the in-memory log to file persistence take place
Jay Kreps 2012-05-08, 21:38
FileChannel.force() always fully syncs the file to disk. This is done irrespective of message boundaries. The file is locked during this time, so other appends are blocked.
-Jay
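The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Kafka's FileMessageSet: the class name `LockedLogSegment` and its methods are hypothetical, and the only assumption taken from the thread is that appends and the flush share one lock, so an append issued during `force()` has to wait until the sync finishes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; this is not Kafka's FileMessageSet.
public class LockedLogSegment implements AutoCloseable {
    private final Object lock = new Object();
    private final FileChannel channel;

    public LockedLogSegment(Path file) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Appends one message. Blocks while a flush holds the lock.
    public void append(byte[] message) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(message);
        synchronized (lock) {
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    // Syncs everything written so far to disk, regardless of message
    // boundaries. Appends are blocked for the duration of the call.
    public void flush() throws IOException {
        synchronized (lock) {
            channel.force(true);
        }
    }

    public long size() throws IOException {
        synchronized (lock) {
            return channel.size();
        }
    }

    @Override
    public void close() throws IOException {
        synchronized (lock) {
            channel.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (LockedLogSegment segment = new LockedLogSegment(file)) {
            segment.append("first message".getBytes(StandardCharsets.UTF_8));
            segment.append("second message".getBytes(StandardCharsets.UTF_8));
            segment.flush();
            System.out.println("bytes in segment: " + segment.size());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that there is no in-memory collection to drain here: messages are written to the channel as they arrive, and the flush only asks the OS to push what has already been written to disk.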
Re: where does the in-memory log to file persistence take place
S Ahmed 2012-05-10, 04:35
Couldn't this be written in a way that doesn't block, or that limits the blocking time? For example, it could make a copy of the reference, then replace it with a newly created channel (synchronized).
Oh, is this not possible because the variable is mapped to a file at the OS level?
Re: where does the in-memory log to file persistence take place
Neha Narkhede 2012-05-10, 19:01
>> couldn't this be written in a way that it doesn't block?
This can't be done since file channels are blocking in nature.
Thanks, Neha
Re: where does the in-memory log to file persistence take place
Jay Kreps 2012-05-10, 22:47
Our current understanding is that there is synchronization in Linux such that the underlying fsync and write calls actually block each other, so even if you have a separate thread do the flush, it doesn't help. We would be interested in finding a way to work around this.
Good blog on this: http://antirez.com/post/fsync-different-thread-useless.html
-Jay
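For anyone who wants to observe this on their own system, here is a minimal sketch of the "flush on a separate thread" approach. It is a hypothetical experiment, not Kafka code: `BackgroundFlushExperiment` and `worstAppendNanos` are made-up names, and the message count, message size, and flush interval are arbitrary. There is no application-level lock, so any stall in the measured appends would come from below the application (the JVM channel or the OS), which is the effect described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical experiment, not Kafka code.
public class BackgroundFlushExperiment {

    // Appends `count` messages while a second thread periodically calls
    // force(). Returns the slowest single append, in nanoseconds.
    public static long worstAppendNanos(Path file, int count, int messageBytes,
                                        long flushIntervalMs)
            throws IOException, InterruptedException {
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            // The flush runs on its own thread and takes no application lock.
            flusher.scheduleAtFixedRate(() -> {
                try {
                    channel.force(true);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, flushIntervalMs, flushIntervalMs, TimeUnit.MILLISECONDS);

            byte[] payload = new byte[messageBytes];
            long worst = 0;
            for (int i = 0; i < count; i++) {
                ByteBuffer buffer = ByteBuffer.wrap(payload);
                long start = System.nanoTime();
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                worst = Math.max(worst, System.nanoTime() - start);
            }
            // Stop flushing before the channel is closed.
            flusher.shutdown();
            flusher.awaitTermination(10, TimeUnit.SECONDS);
            return worst;
        } finally {
            flusher.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("background-flush", ".log");
        try {
            long worst = worstAppendNanos(file, 10_000, 1024, 10);
            System.out.println("worst append latency (microseconds): " + worst / 1_000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Comparing the worst append latency with and without the scheduled flush gives a rough indication of whether writes are stalling behind the sync. The numbers depend heavily on the filesystem, kernel, and disk, so treat them as indicative only.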
Re: where does the in-memory log to file persistence take place
Jay Kreps 2012-05-10, 23:20
Here was the JIRA: https://issues.apache.org/jira/browse/KAFKA-191
-Jay