In some ways the TFile is close to SequenceFiles.
On Fri, Apr 20, 2012 at 8:19 PM, maninder batth
<[EMAIL PROTECTED]> wrote:
> My requirements are to save variable-sized binary records and to be able to
> query them later on. So I was looking at TFile and had some doubts.
> 1. Is the datablock in the TFile a fixed size or variable size? If it is
> fixed, what happens when a record cannot fit in the datablock? Would you
> normally fill the empty space with zeros or spread the record over 2
> 2. Is there any downside to having variable-sized datablocks?
A data block is closed only when its current size, checked at the end
of each append, is >= the minimum block size.
Hence a data block isn't "fixed" in size. If the block is still below
the minimum, another record is written and the condition is checked
again (which may then trigger a block completion). A block can
therefore end up larger than the minimum.
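
To make the rule concrete, here is a rough sketch in Python. It is not
TFile code and does not use any Hadoop API; it only simulates the
"check the size after each append" rule described above. The function
name pack_records, the record sizes, and the minimum of 100 bytes are
all made up for illustration, and flushing the leftover records as a
final short block on close is an assumption of this sketch.

```python
def pack_records(record_sizes, min_block_size):
    """Group record sizes into logical blocks (toy model, not the TFile API)."""
    blocks = []
    current = []
    current_size = 0
    for size in record_sizes:
        # A record is always appended whole to the current block.
        current.append(size)
        current_size += size
        # Only after the append is the block size compared with the minimum.
        if current_size >= min_block_size:
            blocks.append(current)
            current = []
            current_size = 0
    if current:
        # Assumption: leftover records form a final, shorter block on close.
        blocks.append(current)
    return blocks


blocks = pack_records([40, 70, 30, 200, 10], min_block_size=100)
block_sizes = [sum(b) for b in blocks]
print(blocks)
print(block_sizes)
```

In this toy run the blocks come out with different sizes, and the
200-byte record simply makes its block larger than the minimum rather
than being padded or split.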
> 3. Are the records synced with the file at the boundary of a datablock, or are they
> just written to the file system? The question is like write() call in Linux vs
I'm unsure what you mean by a "datablock" here. TFile doesn't work at
the FS level, and the "datablocks" in it are logical. Could you
clarify this question in light of the answers to (1) and (2)?