Kafka, mail # user - Consuming "backwards"?


Re: Consuming "backwards"?
Philip O'Toole 2013-12-06, 15:59
Take apart the hard disk, and flip the magnets in the motors so it spins in reverse. The Kafka software won't be any the wiser. That should give you exactly what you need, combined with high-performance sequential reads.

:-D

> On Dec 6, 2013, at 7:43 AM, Joe Stein <[EMAIL PROTECTED]> wrote:
>
> Hmmm, I just realized that wouldn't work, actually (starting at the end is
> fine)... the fetch size being taken in is still going to increment forward.
>
> The KafkaApi would have to change, because in readMessageSet it is doing a
> log.read of the FileMessageSet.
>
> It should be possible, though not without changing the way the log is
> read when getting the partition with the ReplicaManager.
>
> So let me take that all back and say... it can't be done now, but I think it
> is feasible with some broker modifications to read the log differently.
> Off the top of my head I can't think of how to change the log.read to do
> this without digging further down into the code.
>
>
>
> /*******************************************
> Joe Stein
> Founder, Principal Consultant
> Big Data Open Source Security LLC
> http://www.stealth.ly
> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
>> On Fri, Dec 6, 2013 at 10:26 AM, Joe Stein <[EMAIL PROTECTED]> wrote:
>>
>> The fetch requests are very flexible to do what you want with them.
>>
>> Take a look at SimpleConsumerShell.scala as a reference.
>>
>> You could pass in OffsetRequest.LatestTime (-1) with a fetch size of 3 and
>> then just keep doing that over and over again.
>>
>> I think that will do exactly what you are looking to do.
>>
>> /*******************************************
>> Joe Stein
>> Founder, Principal Consultant
>> Big Data Open Source Security LLC
>> http://www.stealth.ly
>> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>> ********************************************/
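Joe's suggestion above boils down to offset arithmetic: resolve the log-end offset, then repeatedly read a fixed-size window that ends where the previous one began. A minimal sketch, using a plain Python list to stand in for the partition log (the slice plays the role of a real Kafka fetch request, which is an assumption for illustration only):

```python
# Simulate "consume backwards in batches": repeatedly take a window of
# `batch_size` messages ending at the current cursor, newest window first.
# The list stands in for the partition log; log[start:end] stands in for a
# fetch of offsets [start, end).

def backwards_batches(log, batch_size):
    """Yield batches of up to `batch_size` messages, newest batch first."""
    end = len(log)  # analogous to resolving OffsetRequest.LatestTime
    while end > 0:
        start = max(0, end - batch_size)
        yield log[start:end]
        end = start  # next window ends where this one began

log = ["M%d" % i for i in range(1, 13)]  # M1 .. M12, oldest to newest
for batch in backwards_batches(log, 3):
    print(batch)  # ['M10', 'M11', 'M12'], then ['M7', 'M8', 'M9'], ...
```

Note the cursor moves by `batch_size` each pass, which is the part Joe points out the 0.8 broker cannot do natively: a real fetch always reads forward from an offset, so the client has to do this backwards bookkeeping itself.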
>>
>>
>> On Fri, Dec 6, 2013 at 10:04 AM, Otis Gospodnetic <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>>> On Fri, Dec 6, 2013 at 9:38 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Do you mean you want to start from the most recent data and go
>>> backwards to
>>>> the oldest data, or that you want to start with old data and consume
>>>> forwards?
>>>
>>> Forwards is the "normal way".  I'm looking for the "abnormal way", of
>>> course ;) i.e. backwards.
>>> If the following are the messages that came in, oldest to newest:
>>>
>>> M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12
>>>
>>> Then I'd love to be able to consume from the end, say in batches of 3,
>>> like
>>> this:
>>>
>>> get last 3: M10 M11 M12
>>> get last 3: M7 M8 M9
>>> get last 3: M4 M5 M6
>>> get last 3: M1 M2 M3
>>>
>>> Of course, if messages keep coming in, then the new ones that arrive would
>>> get picked up first and, eventually, assuming the Consumer can consume
>>> faster than messages are produced, all messages will get consumed.
>>>
>>> But the key part is that any new ones that arrive will get picked up
>>> first.
>>>
>>>> If the former, it would be difficult or impossible in 0.7.x, but I think
>>>> doable in 0.8.x. (They added some sort of message index). If the latter,
>>>> that is easily accomplished in both versions.
>>>
>>> I'd love to know if that's really so and how to do it!
>>>
>>> We are looking to move to Kafka 0.8 in January and to add performance
>>> monitoring for Kafka 0.8 to SPM (see
>>>
>>> http://blog.sematext.com/2013/10/16/announcement-spm-performance-monitoring-for-kafka/
>>> )
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Performance Monitoring * Log Analytics * Search Analytics
>>> Solr & Elasticsearch Support * http://sematext.com/
>>>
>>>
>>>
>>>>> On Friday, December 6, 2013, Otis Gospodnetic wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Does Kafka offer a way to consume messages in batches, but "from the
>>>> end"?
>>>>>
>>>>> This would be valuable to have in all systems where the most recent