Home | About | Sematext search-lucene.com search-hadoop.com
Hadoop >> mail # user >> Fully distribute TextInputFormat...


Re: Fully distribute TextInputFormat...
What's the format of this file? gzip can't be split.
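
To illustrate why the format matters, here is a rough sketch in plain Java (no Hadoop dependency) of how a file-based input format might decide on its splits. The class SplitSketch and the method computeSplits are made-up names for illustration, and the logic is simplified; it is not Hadoop's actual implementation. The assumption is that an input that cannot be split (a gzip stream, for example) always becomes a single split, and therefore a single map task, while a splittable input is cut into chunks of roughly one block each.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /**
     * Hypothetical helper: returns {offset, length} pairs, one per map task.
     * Simplified on purpose; real input formats apply extra rules.
     */
    static List<long[]> computeSplits(long fileLength, long splitSize, boolean splittable) {
        if (splitSize < 1) {
            throw new IllegalArgumentException("splitSize must be >= 1");
        }
        List<long[]> splits = new ArrayList<>();
        if (fileLength == 0) {
            return splits; // empty file, nothing to process
        }
        if (!splittable) {
            // A stream that can only be read from the start (e.g. gzip)
            // has to be handed to one task as a whole.
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        // Splittable input: one split per chunk of splitSize bytes.
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            splits.add(new long[] {offset, Math.min(splitSize, fileLength - offset)});
        }
        return splits;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;      // assumed file size
        long blockSize = 64L << 20;  // assumed 64 MiB block size
        System.out.println("plain text: " + computeSplits(oneGiB, blockSize, true).size() + " splits");
        System.out.println("gzip:       " + computeSplits(oneGiB, blockSize, false).size() + " split");
    }
}
```

So if the file is gzipped, one task is the expected result rather than a bug. If it is plain text and still produces one task, the file may simply be smaller than one block.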

On Mon, May 10, 2010 at 5:21 AM, Pierre ANCELOT <[EMAIL PROTECTED]> wrote:
> Hi folks :)
> I have one big file... I read it with FileInputFormat, this generates only
> one task and of course, this doesn't get distributed across the cluster
> nodes.
> Should I use another input class, or do I have a bug in my implementation?
>
> The desired behavior is one task per line.
>
> Thanks.
>
>
>
> --
> http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>
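
On the "one task per line" part: Hadoop ships an NLineInputFormat that hands each map task a fixed number of input lines, which may be closer to what you want. The idea can be sketched in plain Java as below. LineSplitSketch and splitByLines are hypothetical names used only for this illustration, and the code just shows the grouping, not how Hadoop implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineSplitSketch {

    /**
     * Hypothetical helper: groups input lines into splits of at most
     * linesPerSplit lines each. With linesPerSplit = 1 every line
     * becomes its own split, i.e. its own task.
     */
    static List<List<String>> splitByLines(List<String> lines, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += linesPerSplit) {
            int end = Math.min(i + linesPerSplit, lines.size());
            // Copy the sublist so each split is independent of the input list.
            splits.add(new ArrayList<>(lines.subList(i, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(splitByLines(lines, 1).size() + " tasks with 1 line per task");
        System.out.println(splitByLines(lines, 2).size() + " tasks with 2 lines per task");
    }
}
```

Keep in mind that one task per line only pays off when each line represents a lot of work. For a big file of short lines, the per-task startup cost will likely dominate.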

--
Best Regards

Jeff Zhang