MapReduce, mail # user - Hadoop Real time help


Re: Hadoop Real time help
Bertrand Dechoux 2012-08-20, 07:37
The terms are
* ESP : http://en.wikipedia.org/wiki/Event_stream_processing
* CEP : http://en.wikipedia.org/wiki/Complex_event_processing

By the way, "processing streams in real time" is close to a pleonasm.

MapReduce follows a batch architecture. You accumulate data until a given
time, then process everything, and only at the end do you provide all the
results. Stream processing has, by definition, a 'smoother' throughput.
Events are processed one at a time, and each one can potentially lead to a
result.
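
To make the contrast above concrete, here is a minimal sketch in plain
Python. It uses neither Hadoop nor any ESP engine; the function names and
the sample readings are hypothetical and only illustrate the two styles.

```python
from typing import Iterable, Iterator, List


def batch_average(events: List[float]) -> float:
    """Batch style: wait until all data is collected, then compute once."""
    return sum(events) / len(events)


def streaming_average(events: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result as each event arrives."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count  # a result is available after every event


# Hypothetical sample data, for illustration only.
readings = [10.0, 20.0, 30.0]
batch_result = batch_average(readings)              # one result, at the end
stream_results = list(streaming_average(readings))  # one result per event
```

The batch function cannot answer before the last reading is in, whereas the
streaming one has an up-to-date answer after every event.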

I don't know of any complete overview of such tools.
Esper is well known in that space.
FlumeBase was an attempt to do something similar (as far as I can tell).
It shows how an ESP engine fits with log collection using a tool such as
Flume.

Then you also have other solutions which will allow you to scale such as
Storm.
A few people have already considered using Storm for scalability and Esper
to do the real computation.
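
The division of labour in that combination can be sketched roughly as
follows. This is plain Python and uses neither the Storm nor the Esper API:
partition() stands in for the scaling layer, WindowRule for the event
engine, and the event shape, threshold and window are all assumptions made
up for the illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Event = Tuple[str, float]  # (key, timestamp in seconds): hypothetical shape


def partition(key: str, workers: int) -> int:
    """Scaling layer: all events for one key go to the same worker."""
    return sum(key.encode("utf-8")) % workers  # deterministic toy hash


class WindowRule:
    """Engine layer: fires when a key has `threshold` events in `window` s."""

    def __init__(self, threshold: int, window: float) -> None:
        self.threshold = threshold
        self.window = window
        self.seen: Dict[str, Deque[float]] = defaultdict(deque)

    def on_event(self, key: str, ts: float) -> bool:
        times = self.seen[key]
        times.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold


def run(events: List[Event], workers: int = 2) -> List[Event]:
    """Route each event to its worker's rule and collect the alerts."""
    rules = [WindowRule(threshold=3, window=10.0) for _ in range(workers)]
    alerts = []
    for key, ts in events:
        if rules[partition(key, workers)].on_event(key, ts):
            alerts.append((key, ts))
    return alerts


# Hypothetical events, for illustration only.
alerts = run([("a", 0.0), ("b", 1.0), ("a", 2.0), ("a", 3.0), ("b", 30.0)])
```

Because routing is by key, each worker only needs local state, which is what
lets the rule evaluation scale out across machines.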

Regards

Bertrand

On Sun, Aug 19, 2012 at 9:44 PM, Niels Basjes <[EMAIL PROTECTED]> wrote:

> Is there a "complete" overview of the tools that allow processing streams
> of data in realtime?
>
> Or even better; what are the terms to google for?
>
> --
> Kind regards,
> Niels Basjes
> (Sent from mobile)
> On 19 Aug 2012 at 18:22, "Bertrand Dechoux" <[EMAIL PROTECTED]> wrote:
>
>> That's a good question. More and more people are talking about Hadoop Real
>> Time.
>> One key aspect of this question is whether we are talking about MapReduce
>> or not.
>>
>> MapReduce greatly improves the response time of data-intensive jobs,
>> but it is still a batch framework with noticeable latency.
>>
>> There are multiple ways to improve the latency:
>> * ESP/CEP solutions (like Esper, FlumeBase, ...)
>> * Big Table clones (like HBase ...)
>> * YARN with a non MapReduce application
>> * ...
>>
>> But it will really depend on the context and the definition of 'real
>> time'.
>>
>> Regards
>>
>> Bertrand
>>
>>
>>
>> On Sun, Aug 19, 2012 at 5:44 PM, mahout user <[EMAIL PROTECTED]> wrote:
>>
>>> Hello folks,
>>>
>>>
>>>    I am new to Hadoop. I would like to know how the Hadoop framework
>>> is useful for real-time services. Can anyone explain this to me?
>>>
>>> Thanks.
>>>
>>
>>
>>
>> --
>> Bertrand Dechoux
>>
>
--
Bertrand Dechoux