Hadoop user mailing list >> Incremental Mappers?


Romeo Kienzler 2011-11-22, 07:01
Re: Incremental Mappers?
You're correct: currently HDFS only supports reading from closed files. You can, however, configure Flume to write your data in chunks small enough that you can process them incrementally.
-Joey
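
For reference, here is a minimal sketch of the kind of chunked-write setup Joey describes, in the Flume NG properties format (which postdates this thread); the agent, source, channel, and sink names, the tailed log file, and the roll thresholds are all illustrative assumptions:

    # Hypothetical agent: tail a local log and write small, regularly
    # rolled files into HDFS. Names, paths, and thresholds are examples.
    agent.sources  = tail
    agent.channels = mem
    agent.sinks    = toHdfs

    agent.sources.tail.type     = exec
    agent.sources.tail.command  = tail -F /var/log/app.log
    agent.sources.tail.channels = mem

    agent.channels.mem.type = memory

    agent.sinks.toHdfs.type          = hdfs
    agent.sinks.toHdfs.channel       = mem
    agent.sinks.toHdfs.hdfs.path     = hdfs://namenode:8020/flume/incoming
    agent.sinks.toHdfs.hdfs.fileType = DataStream
    # Roll (close) the current file every 60 seconds or every 64 MB,
    # whichever comes first. Only closed files are readable by mappers.
    agent.sinks.toHdfs.hdfs.rollInterval = 60
    agent.sinks.toHdfs.hdfs.rollSize     = 67108864
    agent.sinks.toHdfs.hdfs.rollCount    = 0

Smaller roll thresholds mean files close sooner and become processable sooner, at the cost of more (and smaller) files in HDFS.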

On Nov 22, 2011, at 2:01, Romeo Kienzler <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm planning to use Flume to stream data from a local client machine into HDFS running in a cloud environment.
>
> Is there a way to start a mapper on a file that is still open? As far as I know, a file in HDFS has to be closed before a mapper can start.
>
> Is this true?
>
> Does anyone have an idea for a solution to this problem?
>
> Or do I have to split my big input file into smaller chunks, create multiple files in HDFS, and start a separate map task on each file once it has been closed?
>
> Best Regards,
>
> Romeo
>
> Romeo Kienzler
> r o m e o @ o r m i u m . d e
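
On the last question above: processing each closed chunk with its own map task amounts to a map-only job over the directory the rolled files land in. A minimal driver sketch under that assumption follows; the IncrementalChunkJob and ChunkMapper class names and the HDFS paths are hypothetical, and the mapper just passes records through:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class IncrementalChunkJob {

      // Identity-style mapper; real per-record processing would go here.
      public static class ChunkMapper
          extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
          context.write(key, value);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "process-closed-chunks");
        job.setJarByClass(IncrementalChunkJob.class);
        job.setMapperClass(ChunkMapper.class);
        job.setNumReduceTasks(0); // map-only: each closed chunk is independent
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        // Directory where Flume rolls finished (closed) files -- hypothetical.
        FileInputFormat.addInputPath(job, new Path("/flume/incoming"));
        FileOutputFormat.setOutputPath(job, new Path("/flume/processed/run-0001"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Each run picks up whatever files have been closed so far; rerunning it periodically (and moving processed files aside) gives the incremental behavior asked about.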