Flume, mail # user - Analysis of Data


Re: Analysis of Data
Mike Percy 2013-02-08, 02:46
Thanks for replying Nitin. My thoughts inline:

On Thu, Feb 7, 2013 at 3:22 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:

> 1) Flume is an isolated distributed system, in the sense that one agent
> has no idea about any other agent
>
Avro sinks know about downstream Avro sources, so basically it's a digraph,
right?

> 2) When Flume needs to collect data from multiple references and work
> across different data sets, it may not have the entire data set
> needed
>
I see what you are saying, however that is often the case with a streaming
data processing system, right?

> 3) Let us assume we have the required data on the agents for processing it
> in batches. Do we really want to put pressure on a live production server
> for data processing which can be done by systems like Storm or Hadoop or
> some other system?
>
The data can be sent to downstream hops so there is no need to do data
processing on the application tier.

> These are my ideas. I can be totally wrong, but just from a systems point
> of view it looks like a good option to keep data acquisition separate from
> data processing, and then store the processed data for further data serving.
>
In theory I agree with you, but because Flume can pipe data to downstream
agents that can do the heavy processing, it seems to me that this
requirement is easily fulfilled by Flume.
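
To illustrate the multi-hop layout described above, here is a minimal
configuration sketch, not a tested deployment. The agent names (app,
collector), host names, port, log path, and HDFS path are all hypothetical,
and the capacity value is arbitrary. The application-tier agent only forwards
events through an Avro sink; the downstream agent receives them on an Avro
source, which is where any heavier work would be attached.

```properties
# --- Agent "app" (hypothetical name): runs on the application server ---
app.sources = applog
app.channels = mem
app.sinks = toCollector

# Tail a log file (path is an assumption for this sketch)
app.sources.applog.type = exec
app.sources.applog.command = tail -F /var/log/app/app.log
app.sources.applog.channels = mem

app.channels.mem.type = memory
app.channels.mem.capacity = 10000

# The Avro sink is the only thing that knows about the next hop
app.sinks.toCollector.type = avro
app.sinks.toCollector.hostname = collector.example.com
app.sinks.toCollector.port = 4545
app.sinks.toCollector.channel = mem

# --- Agent "collector" (hypothetical name): runs on a separate tier ---
collector.sources = fromApp
collector.channels = mem
collector.sinks = toHdfs

# The Avro source listens for events sent by upstream Avro sinks
collector.sources.fromApp.type = avro
collector.sources.fromApp.bind = 0.0.0.0
collector.sources.fromApp.port = 4545
collector.sources.fromApp.channels = mem

collector.channels.mem.type = memory
collector.channels.mem.capacity = 10000

collector.sinks.toHdfs.type = hdfs
collector.sinks.toHdfs.hdfs.path = hdfs://namenode.example.com/flume/events
collector.sinks.toHdfs.channel = mem
```

With this split, the production server does nothing beyond reading the log
and shipping events, so any processing cost lands on the collector tier.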

Regards,
Mike
> On Thu, Feb 7, 2013 at 4:29 PM, Mike Percy <[EMAIL PROTECTED]> wrote:
>
>> Let's take this conversation further. What is missing?
>>
>>
>> On Thu, Feb 7, 2013 at 2:39 AM, Inder Pall <[EMAIL PROTECTED]> wrote:
>>
>>> Flume is a platform to get events to the right sink (HDFS, local file,
>>> ...).
>>> Analytics is not something which falls in its territory.
>>>
>>> - Inder
>>>
>>>
>>> On Thu, Feb 7, 2013 at 3:22 PM, Surindhar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Does Flume support analysis of data?
>>>>
>>>> Br,
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> - Inder
>>> "You are average of the 5 people you spend the most time with"
>>>
>>
>>
>
>
> --
> Nitin Pawar
>