Kafka, mail # user - Keeping logs forever


Anthony Grimes 2013-02-22, 00:07
Eric Tschetter 2013-02-22, 00:30
Milind Parikh 2013-02-22, 00:44
Jay Kreps 2013-02-22, 01:26
graham sanderson 2013-02-22, 02:47
Eric Tschetter 2013-02-22, 21:39
Jay Kreps 2013-02-23, 04:13
Re: Re: Keeping logs forever
Anthony Grimes 2013-02-22, 05:33
Sounds good. Thanks for the input, kind sir!

Jay Kreps wrote:
> You can do this and it should work fine. You would have to keep adding
> machines to get disk capacity, of course, since your data set would
> only grow.
>
> We will keep an open file descriptor per file, but I think that is
> okay. Just set the segment size to 1GB, then with 10TB of storage that
> is only 10k files which should be fine. Adjust the OS open FD limit up
> a bit if needed. File descriptors don't use too much memory so this
> should not hurt anything.
>
> -Jay
>
> On Thu, Feb 21, 2013 at 4:00 PM, Anthony Grimes<[EMAIL PROTECTED]>  wrote:
>> Our use case is that we'd like to log away data we don't immediately need and
>> potentially replay it at some point. We don't want to delete old logs. I
>> googled around a bit and I only discovered this particular post:
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201210.mbox/%[EMAIL PROTECTED]%3E
>>
>> In summary, it appears the primary issue is that Kafka keeps file handles of
>> each log segment open. Is there a way to configure this, or is a way to do
>> so planned? It appears that an option to deduplicate instead of delete was
>> added recently, so doesn't the file handle issue exist with that as well
>> (since files aren't being deleted)?
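
For readers finding this thread later, here is a minimal sketch of the broker
settings Jay describes. The property names and the -1 "no limit" convention
below are my assumptions for a 0.8-or-later broker; treat this as a starting
point and verify against the documentation for your Kafka version.

    # server.properties -- retain data indefinitely (sketch, not verified on 0.8)
    log.segment.bytes=1073741824    # 1GB segments, as Jay suggests
    log.retention.hours=-1          # assumption: -1 disables time-based deletion
    log.retention.bytes=-1          # assumption: -1 disables size-based deletion

    # The broker keeps one open file descriptor per segment file, so raise the
    # OS limit for the broker process if needed, e.g. from the shell:
    ulimit -n 100000

The "deduplicate instead of delete" option Anthony mentions is log compaction
(log.cleanup.policy=compact in later releases); it keeps only the latest record
per key rather than every record, so it is a different trade-off from simply
retaining everything.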