Anthony Grimes 2013-02-22, 00:07
Eric Tschetter 2013-02-22, 00:30
Milind Parikh 2013-02-22, 00:44
Jay Kreps 2013-02-22, 01:26
graham sanderson 2013-02-22, 02:47
Eric Tschetter 2013-02-22, 21:39
Jay Kreps 2013-02-23, 04:13
Sounds good. Thanks for the input, kind sir!
Jay Kreps wrote:
> You can do this and it should work fine. You would have to keep adding
> machines to get disk capacity, of course, since your data set would
> only grow.
> We will keep an open file descriptor per file, but I think that is
> okay. Just set the segment size to 1GB, then with 10TB of storage that
> is only 10k files which should be fine. Adjust the OS open FD limit up
> a bit if needed. File descriptors don't use too much memory so this
> should not hurt anything.
> On Thu, Feb 21, 2013 at 4:00 PM, Anthony Grimes<[EMAIL PROTECTED]> wrote:
>> Our use case is that we'd like to log away data we don't immediately need and
>> potentially replay it at some point. We don't want to delete old logs. I
>> googled around a bit and I only discovered this particular post:
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201210.mbox/%[EMAIL PROTECTED]%3E
>> In summary, it appears the primary issue is that Kafka keeps file handles of
>> each log segment open. Is there a way to configure this, or is a way to do
>> so planned? It appears that an option to deduplicate instead of delete was
>> added recently, so doesn't the file handle issue exist with that as well
>> (since files aren't being deleted)?
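The advice in Jay's reply (1 GB segments, so roughly 10k open files for 10 TB of retained data) can be sketched as broker settings plus an OS limit check. This is a hypothetical sketch, not from the thread itself: the property names `log.segment.bytes` and `log.retention.hours` are the Kafka broker config names of that era, and the exact limit values are illustrative.

```shell
# Hypothetical server.properties entries (assumed Kafka 0.8-era names):
#   log.segment.bytes=1073741824     # 1 GB per log segment, as suggested above
#   log.retention.hours=2147483647   # effectively "never delete"
#
# Back-of-envelope check from the thread: 10 TB of retained data at
# 1 GB per segment means roughly 10 * 1024 = 10240 open file descriptors.
segments=$(( 10 * 1024 ))
echo "expected open segment files: ${segments}"

# So raise the per-process open-file limit comfortably above that,
# e.g. in the shell that launches the broker (or via limits.conf):
ulimit -n 32768
ulimit -n
```

The retention value is just a large sentinel; the point is that with segment size fixed, the FD count grows linearly with retained data, so the limit only needs occasional bumps as the cluster grows.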