Kafka >> mail # user >> Keeping logs forever

Re: Keeping logs forever
Sounds good. Thanks for the input, kind sir!

Jay Kreps wrote:
> You can do this and it should work fine. You would have to keep adding
> machines to get disk capacity, of course, since your data set would
> only grow.
> We will keep an open file descriptor per file, but I think that is
> okay. Just set the segment size to 1GB, then with 10TB of storage that
> is only 10k files which should be fine. Adjust the OS open FD limit up
> a bit if needed. File descriptors don't use too much memory so this
> should not hurt anything.
> -Jay
> On Thu, Feb 21, 2013 at 4:00 PM, Anthony Grimes <[EMAIL PROTECTED]> wrote:
>> Our use case is that we'd like to log away data we don't need right now and
>> potentially replay it at some point. We don't want to delete old logs. I
>> googled around a bit and only found this one post:
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201210.mbox/%[EMAIL PROTECTED]%3E
>> In summary, it appears the primary issue is that Kafka keeps a file handle
>> open for each log segment. Is there a way to configure this, or is one
>> planned? It appears that an option to deduplicate instead of delete was
>> added recently, so doesn't the file handle issue exist with that as well
>> (since files aren't being deleted)?
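
For anyone wanting to sanity-check Jay's estimate against their own numbers, here is a rough Python sketch. It assumes binary units (1GB = 1024^3 bytes) and, as in Jay's reply, one open descriptor per segment file. The function names, the files_per_segment knob, and the headroom value are made up for illustration and are not part of Kafka.

```python
import resource  # Unix-only standard library module

GIB = 1024 ** 3
TIB = 1024 ** 4


def segment_count(total_bytes, segment_bytes, files_per_segment=1):
    """Estimate open files needed to keep total_bytes of log on disk.

    files_per_segment is a hypothetical knob; 1 matches the estimate
    in the thread (one descriptor per segment file).
    """
    segments = -(-total_bytes // segment_bytes)  # ceiling division
    return segments * files_per_segment


def fits_fd_limit(needed, soft_limit, headroom=1024):
    """True if `needed` descriptors fit under the soft limit.

    headroom is an arbitrary allowance for sockets and other files.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return needed + headroom <= soft_limit


if __name__ == "__main__":
    needed = segment_count(10 * TIB, 1 * GIB)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("segment files:", needed)
    print("soft FD limit:", soft, "hard FD limit:", hard)
    print("fits:", fits_fd_limit(needed, soft))
```

With 10TB of storage and 1GB segments this comes out to 10240 files, which lines up with the "only 10k files" figure above. If the last line reports that it does not fit, that is the case where the OS open FD limit needs raising for the broker process.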