Re: CallbackHandler in Kafka 0.8
Hi Nitin,
> We receive events from an external source (for example, Facebook status
> update events). These events are pushed to a Kafka queue when received.
> There is a possibility of duplicate events (multiple Facebook status
> update events for the same account in quick succession) arriving and
> being pushed into the Kafka queue. At the consumer end, we do not want to
> process duplicate events (connect to Facebook and fetch the status). We
> would prefer not to have another data structure to single-instance the
> events received. There is a time limit within which we want to
> single-instance whatever events are received (fetch, at once, all the
> status update events received in five minutes for a single account). In
> the async producer, events are not written to the broker synchronously
> anyway. They are batched and then pushed after a predefined time
> interval. That's perfect for us. We just want to look at the batch,
> delete duplicate events from it, and then push it to the broker.

However, this depends on your event rate: if the event rate is very
high, the batch threshold is reached and the send happens before that
time interval elapses. Furthermore, even if the event rate is low, the
de-duplication applies to a time window of at most
queue.buffering.max.ms.

In 0.7, batches would be taken and the callback handler's
beforeSendingData method would be invoked on those batches. So you
can achieve the same effect as you did in 0.7 by keeping an
additional buffer (before the actual send) of
queue.buffering.max.messages (i.e., the batch size) and de-duping
within that buffer.
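
A minimal sketch of such an extra buffer, in Python for brevity. The
class name DedupBuffer, the send_batch callback, and the limits are all
hypothetical and not part of any Kafka API; send_batch stands in for
whatever hands the de-duplicated batch to the 0.8 producer.

```python
import time
from typing import Callable, Dict, Hashable, List


class DedupBuffer:
    """Holds messages in front of the producer and keeps one message per key."""

    def __init__(
        self,
        send_batch: Callable[[List[object]], None],  # hypothetical hand-off to the producer
        max_messages: int = 200,                     # assumed size threshold
        max_age_s: float = 5.0,                      # assumed time threshold, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send_batch = send_batch
        self._max_messages = max_messages
        self._max_age_s = max_age_s
        self._clock = clock
        self._pending: Dict[Hashable, object] = {}   # dicts keep insertion order
        self._first_added_at = 0.0

    def add(self, key: Hashable, message: object) -> bool:
        """Buffer the message unless its key is already pending. Returns True if kept."""
        self.flush_if_due()
        if key in self._pending:
            return False  # duplicate within the current window: drop it
        if not self._pending:
            self._first_added_at = self._clock()
        self._pending[key] = message
        if len(self._pending) >= self._max_messages:
            self.flush()
        return True

    def flush_if_due(self) -> None:
        """Flush if the oldest pending message has waited at least max_age_s."""
        if self._pending and self._clock() - self._first_added_at >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        """Send everything that is pending as one batch and start a new window."""
        if self._pending:
            batch = list(self._pending.values())
            self._pending.clear()
            self._send_batch(batch)
```

Note that the age is only checked when add or flush_if_due is called, so
an application with a low event rate would also need to call
flush_if_due from a timer. As with the producer's own batching,
duplicates are only removed within one window, not across windows.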

Thanks,

Joel

> On Wed, Jul 3, 2013 at 3:07 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>
>> Callback handlers are no longer supported in 0.8. Can you go into why
>> the filtering needs to be done at this stage as opposed to before
>> actually sending to the producer?
>>
>> Thanks,
>>
>> Joel
>>
>> On Tue, Jul 2, 2013 at 10:41 AM, Nitin Supekar <[EMAIL PROTECTED]> wrote:
>> > Hello-
>> >
>> >    Is CallbackHandler supported in Kafka 0.8 for async producers?
>> >
>> > If yes, can I use it to alter the batched messages before they are pushed
>> > to broker? For example, I may want to delete some of the messages in the
>> > batch based on some business logic in my application?
>> >
>> > If no, is there any alternate way? I want to do some kind of single
>> > instancing on messages pushed in kafka in last X minutes.
>> >
>> > thanks
>>
On Wed, Jul 3, 2013 at 1:33 AM, Nitin Supekar <[EMAIL PROTECTED]> wrote:
> Hello-
>
> We receive events from an external source (for example, Facebook status
> update events). These events are pushed to a Kafka queue when received.
> There is a possibility of duplicate events (multiple Facebook status
> update events for the same account in quick succession) arriving and
> being pushed into the Kafka queue. At the consumer end, we do not want to
> process duplicate events (connect to Facebook and fetch the status). We
> would prefer not to have another data structure to single-instance the
> events received. There is a time limit within which we want to
> single-instance whatever events are received (fetch, at once, all the
> status update events received in five minutes for a single account). In
> the async producer, events are not written to the broker synchronously
> anyway. They are batched and then pushed after a predefined time
> interval. That's perfect for us. We just want to look at the batch,
> delete duplicate events from it, and then push it to the broker.
>
> Possible?
>
> thanks