Why pad transaction log files?
Vishal Kher 2011-05-25, 18:08
Hi,
I am working on a fix for ZOOKEEPER-1069.
While going through the Util.padLogFile() method, it is not clear to me why this method is really needed. It would be nice if someone could clarify its advantage.

    private static final ByteBuffer fill = ByteBuffer.allocateDirect(1);
    [...]
    /**
     * Grows the file to the specified number of bytes. This only happens if
     * the current file position is sufficiently close (less than 4K) to end of
     * file.
     *
     * @param f output stream to pad
     * @param currentSize application keeps track of the current file size
     * @param preAllocSize how many bytes to pad
     * @return the new file size. It can be the same as currentSize if no
     * padding was done.
     * @throws IOException
     */
    public static long padLogFile(FileOutputStream f, long currentSize,
            long preAllocSize) throws IOException {
        long position = f.getChannel().position();
        if (position + 4096 >= currentSize) {
            currentSize = currentSize + preAllocSize;
            fill.position(0);
            f.getChannel().write(fill, currentSize - fill.remaining());
        }
        return currentSize;
    }

It looks like the method was intended to *allocate* disk blocks in advance to (a) try to get sequential allocation and (b) avoid allocation during writes. But the method does not do that; it just grows the file size. What is the advantage of that? It achieves neither of the above.
What am I missing here?
Thanks, -Vishal
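To make the behaviour in question concrete, here is a minimal sketch that runs the quoted padLogFile() logic against a temporary file. The class name PadLogFileDemo, the run() helper, and the 64 KB preAllocSize are assumptions for illustration only and are not part of ZooKeeper. It only shows what the application can observe (the file length); it says nothing about whether disk blocks are actually allocated, which depends on the file system.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

public class PadLogFileDemo {
    private static final ByteBuffer fill = ByteBuffer.allocateDirect(1);

    // Same logic as the method quoted in the message above.
    static long padLogFile(FileOutputStream f, long currentSize,
            long preAllocSize) throws IOException {
        long position = f.getChannel().position();
        if (position + 4096 >= currentSize) {
            currentSize = currentSize + preAllocSize;
            fill.position(0);
            // One byte written at the new last offset extends the file length.
            f.getChannel().write(fill, currentSize - fill.remaining());
        }
        return currentSize;
    }

    /**
     * Hypothetical helper: pads an empty temp file, writes 16 bytes, then
     * calls padLogFile() again. Returns {size tracked after the first call,
     * length reported by the channel, size tracked after the second call}.
     */
    static long[] run(long preAllocSize) {
        try {
            Path tmp = Files.createTempFile("padlog", ".log");
            try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
                long tracked = padLogFile(out, 0, preAllocSize);
                long onDisk = out.getChannel().size();
                out.write(new byte[16]); // position is now 16, far from the end
                long again = padLogFile(out, tracked, preAllocSize);
                return new long[] { tracked, onDisk, again };
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = run(64 * 1024);
        System.out.println("tracked=" + r[0] + " length=" + r[1]
                + " afterSecondCall=" + r[2]);
    }
}
```

With a 64 KB preAllocSize, all three values should come out as 65536: the first call grows the file with a single one-byte write, and the second call does nothing because the write position is more than 4K away from the tracked end of file.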
Re: Why pad transaction log files?
Benjamin Reed 2011-05-25, 20:23
we found that just growing the file got us performance advantages.
ben
Re: Why pad transaction log files?
Vishal Kher 2011-05-25, 21:04
Interesting. Do you remember why? What was the file system used during the test?
Re: Why pad transaction log files?
Benjamin Reed 2011-05-25, 21:15
we run on ext3. it makes a big difference.
ben
Re: Why pad transaction log files?
Benjamin Reed 2011-05-25, 21:16
i think one of the reasons is that you don't have to keep updating the file size as you are appending, so you avoid seeks.
ben
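The point about not updating the file size on every append can be sketched as follows. This is an illustration under assumptions, not a measurement: the class SizeChangeDemo and the method countSizeChanges() are hypothetical names, and the code only counts how often the file length seen by the application changes. Whether fewer length changes translate into fewer seeks depends on the file system (ext3 in the setup described above) and is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SizeChangeDemo {
    /**
     * Appends `records` records of `recordLen` bytes to a temp file and
     * returns how many of those appends changed the file length.
     * A preAllocSize of 0 means no padding.
     */
    static int countSizeChanges(long preAllocSize, int records, int recordLen) {
        try {
            Path tmp = Files.createTempFile("txnlog", ".log");
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                if (preAllocSize > 0) {
                    // Grow the file up front with one byte at the last offset.
                    // A positional write leaves the channel position at 0.
                    ch.write(ByteBuffer.allocate(1), preAllocSize - 1);
                }
                int changes = 0;
                long lastSize = ch.size();
                for (int i = 0; i < records; i++) {
                    ByteBuffer rec = ByteBuffer.allocate(recordLen);
                    while (rec.hasRemaining()) {
                        ch.write(rec); // sequential write at the current position
                    }
                    long size = ch.size();
                    if (size != lastSize) {
                        changes++;
                        lastSize = size;
                    }
                }
                return changes;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("padded:   " + countSizeChanges(64 * 1024, 100, 64));
        System.out.println("unpadded: " + countSizeChanges(0, 100, 64));
    }
}
```

In the padded case the 100 appends stay inside the pre-grown region, so the length never changes; in the unpadded case every append extends the file.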