On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao <[EMAIL PROTECTED]> wrote:
> 1. Which is the best way to implement such a scenario using Flume/ Hadoop?
You could use the file spooling client / source to stream these files back
in the latest trunk and upcoming Flume 1.3.0 builds, along with hdfs sink.
2. The customer would like to keep the log files in thier original state
> (file name, size, etc..). Is it practicable using Flume?
Not recommended. Flume is an event streaming system, not a file copying
mechanism. If you want to do that, just use some scripts with hadoop fs
-put instead of Flume. Flume provides a bunch of stream-oriented features
on top of its event streaming architecture, such as data enrichment
capabilities, event routing, and configurable file rolling on HDFS, to name