Hadoop user mailing list - Incremental Mappers?


Re: Incremental Mappers?
Joey Echeverria 2011-11-22, 11:32
You're correct: currently HDFS only supports reading from closed files. You can configure Flume to write your data in small enough chunks that you can process them incrementally.
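For example, a Flume NG HDFS sink can be told to roll (close) its output files frequently. A minimal sketch, assuming the standard hdfs.roll* sink properties; the agent, channel, and path names below are placeholders:

    # HDFS sink that writes incoming events as plain files
    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.channel = mem-channel
    agent.sinks.hdfs-sink.hdfs.path = hdfs://namenode:8020/flume/events
    agent.sinks.hdfs-sink.hdfs.fileType = DataStream
    # close the current file every 60 seconds ...
    agent.sinks.hdfs-sink.hdfs.rollInterval = 60
    # ... or once it reaches roughly 64 MB, whichever comes first
    agent.sinks.hdfs-sink.hdfs.rollSize = 67108864
    # disable event-count-based rolling
    agent.sinks.hdfs-sink.hdfs.rollCount = 0

Each file that Flume closes can then be picked up by a MapReduce job, e.g. by pointing the job's input path at the directory of already-closed files.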
-Joey

On Nov 22, 2011, at 2:01, Romeo Kienzler <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm planning to use Flume in order to stream data from a local client machine into HDFS running on a cloud environment.
>
> Is there a way to start a mapper on a file that hasn't been completely written yet? As far as I know, a file in HDFS has to be closed before a mapper can read it.
>
> Is this true?
>
> Are there any ideas for a possible solution to this problem?
>
> Or do I have to write my big input file in smaller chunks, creating multiple files in HDFS, and start a separate map task on each file once it has been closed?
>
> Best Regards,
>
> Romeo
>
> Romeo Kienzler
> r o m e o @ o r m i u m . d e