

Re: the cleaner and log segments
Was that "write an empty log segment" feature always there?

On 11/18/2011 06:39 PM, Joel Koshy wrote:
> Just want to see if I understand this right - when the log cleaner
> does its thing, even if all the segments are eligible for garbage
> collection the cleaner will nuke those files and should deposit an
> empty segment file named with the next valid offset in that partition.
> I think Taylor encountered a case where that empty segment was not
> added. Is this the race condition that you speak of? If, for example, the
> broker crashes before that empty segment file is created...
>
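To make the "empty segment file named with the next valid offset" part concrete, here is a minimal Java sketch of that naming scheme; the 20-digit zero padding and the .kafka extension are assumptions for illustration, not taken from the thread.

    import java.io.File;

    public class SegmentName {
        // Build a segment file name from its starting offset,
        // zero-padded so file names sort in offset order.
        static String segmentFileName(long startOffset) {
            return String.format("%020d.kafka", startOffset);
        }

        public static void main(String[] args) {
            // After the cleaner removes all old segments, the new empty
            // segment is named with the next valid offset in the partition.
            long nextValidOffset = 7340032L; // hypothetical next offset
            System.out.println(new File("/tmp/kafka-logs/topic-0",
                    segmentFileName(nextValidOffset)));
            // prints: /tmp/kafka-logs/topic-0/00000000000007340032.kafka
        }
    }
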
> Also, I have seen the log cleaner act up more than once in the past -
> basically seems to get scheduled continuously and delete file 0000...
> I think someone else on the list saw that before. I have been unable
> to reproduce that though - and it is not impossible that there was a
> misconfiguration at play.
>
> Thanks,
>
> Joel
>
> On Fri, Nov 18, 2011 at 11:50 AM, Taylor Gautier <[EMAIL PROTECTED]> wrote:
>> Ok, that's what we are already doing.  In essence, when that happens it
>> is a bit like a rollover. Except that, depending on the values, a
>> consumer may hold an offset low enough that when it requests the topic
>> the offset is still within range but no longer valid, since messages
>> were delivered to the broker in the meantime. Essentially it's a race
>> condition that might be somewhat hard to induce but is theoretically
>> possible. With a true 64-bit rollover this is more or less never going
>> to happen, because 64 bits is just too large to open a time window
>> long enough for the race to occur.
>>
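For a sense of scale behind that last claim, a small Java sketch of the arithmetic, assuming a sustained rate of ten million offset increments per second; the rate is an illustrative number only.

    public class RolloverWindow {
        public static void main(String[] args) {
            // How long until a signed 64-bit offset wraps, at a given rate?
            double incrementsPerSecond = 10_000_000d; // assumed sustained rate
            double secondsToWrap = Long.MAX_VALUE / incrementsPerSecond;
            double yearsToWrap = secondsToWrap / (365.25 * 24 * 3600);
            System.out.printf("~%.0f years until a 64-bit offset wraps%n",
                    yearsToWrap);
            // roughly 29,000 years at 10M increments/s, so the race window
            // effectively never opens.
        }
    }
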
>>
>>
>> On Nov 18, 2011, at 10:32 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>>
>>> Taylor,
>>>
>>> If you request an offset whose corresponding log file has been deleted, you
>>> will get an OffsetOutOfRange exception. When this happens, you can use the
>>> getLatestOffset api in SimpleConsumer to obtain either the current valid
>>> smallest or largest offset and reconsume from there. Our high level
>>> consumer does that for you (among many other things). That's why we
>>> encourage most users to use the high level api instead.
>>>
>>> Thanks,
>>>
>>> Jun
>>
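As a rough Java sketch of the recovery pattern Jun describes: the OffsetOutOfRange class and OffsetFetcher interface below are hypothetical stand-ins for the real client calls (e.g. SimpleConsumer's offset lookup), since only the catch-and-rewind shape matters here.

    import java.util.function.LongSupplier;

    public class OffsetReset {
        // Hypothetical signal for the broker's out-of-range response.
        static class OffsetOutOfRange extends RuntimeException {}

        // Hypothetical stand-in for a SimpleConsumer fetch call; returns
        // the next offset to request after consuming the fetched messages.
        interface OffsetFetcher {
            long fetchFrom(long offset) throws OffsetOutOfRange;
        }

        // Fetch-loop skeleton: on an out-of-range offset, ask the broker for
        // a valid offset (smallest to replay what is still on disk, largest
        // to skip ahead to new data) and resume from there. The high-level
        // consumer does this automatically.
        static void consume(LongSupplier smallestValidOffset, OffsetFetcher fetcher) {
            long offset = 0L;
            while (true) {
                try {
                    offset = fetcher.fetchFrom(offset);
                } catch (OffsetOutOfRange e) {
                    offset = smallestValidOffset.getAsLong(); // or largest, per policy
                }
            }
        }
    }
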