Romeo Kienzler 2011-11-22, 07:01
Re: Incremental Mappers?
Joey Echeverria 2011-11-22, 11:32
You're correct: currently HDFS only supports reading from closed files. You can configure Flume to write your data in small enough chunks that you can process them incrementally.
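For example, the HDFS sink can be told to roll (close) files based on time, size, or event count. A minimal sketch, assuming Flume NG's HDFS sink; the agent/source/channel/sink names, the path, and the port are placeholders:

    # Hypothetical agent "a1" shipping events from a client into HDFS.
    a1.sources  = r1
    a1.channels = c1
    a1.sinks    = k1

    # Source: avro shown only as an example of a client-facing source.
    a1.sources.r1.type = avro
    a1.sources.r1.bind = 0.0.0.0
    a1.sources.r1.port = 41414
    a1.sources.r1.channels = c1

    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 10000

    # HDFS sink: roll (close) the current file every 60 seconds or at
    # 128 MB, whichever comes first; rollCount = 0 disables count-based
    # rolling. Closed files become visible to mappers.
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events
    a1.sinks.k1.hdfs.fileType = DataStream
    a1.sinks.k1.hdfs.rollInterval = 60
    a1.sinks.k1.hdfs.rollSize = 134217728
    a1.sinks.k1.hdfs.rollCount = 0

Each roll closes the file, so a periodically scheduled job (via cron or a workflow scheduler such as Oozie, for instance) can pick up the newly closed files and process them incrementally.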
On Nov 22, 2011, at 2:01, Romeo Kienzler <[EMAIL PROTECTED]> wrote:
> I'm planning to use Flume to stream data from a local client machine into HDFS running in a cloud environment.
> Is there a way to start a mapper on a file that is still incomplete? As far as I know, a file in HDFS has to be closed before a mapper can start reading it.
> Is this true?
> Do you have any ideas for a solution to this problem?
> Or do I have to split my big input file into smaller chunks, create multiple files in HDFS, and start a separate map task on each file once it has been closed?
> Best Regards,
> Romeo Kienzler
> r o m e o @ o r m i u m . d e