Consider NLineInputFormat. -C
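A minimal driver sketch using NLineInputFormat from the old "mapred" API (current as of Hadoop 0.20); the mapper body, class names, and paths are placeholders, and the actual fetch/resize logic is elided:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class ResizeJob {

  // Placeholder mapper: receives one URL per call (key = byte offset,
  // value = one line of the input file).
  public static class ResizeMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text url,
                    OutputCollector<Text, Text> out, Reporter reporter)
        throws IOException {
      // fetch and resize the image at `url` here (elided)
      out.collect(url, new Text("resized"));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(ResizeJob.class);
    conf.setJobName("image-resize");

    // Each map task receives N input lines (here, 10 URLs) instead of
    // one task per HDFS block, so a small file still fans out.
    conf.setInputFormat(NLineInputFormat.class);
    conf.setInt("mapred.line.input.format.linespermap", 10);

    conf.setMapperClass(ResizeMapper.class);
    conf.setNumReduceTasks(0); // map-only job
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    FileInputFormat.addInputPath(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}
```

With 10,000 URLs and 10 lines per split, this yields 1,000 map tasks the scheduler can spread across the cluster.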
On Fri, Dec 4, 2009 at 5:34 PM, Ted Xu <[EMAIL PROTECTED]> wrote:
> Hi Daniel,
> I think there are better solutions, but simply chopping the input file into
> pieces (e.g. 10 URLs per file) should work.
> 2009/12/4 Daniel Garcia <[EMAIL PROTECTED]>
>> I'm trying to rewrite an image resizing program in terms of
>> map/reduce. The problem I see is that the job is not broken up into small
>> enough tasks. If I have only one input file with 10,000 URLs (the file is much
>> smaller than the HDFS block size), how can I ensure that the job is distributed
>> among all the nodes? In other words, how can I ensure that the task size is
>> small enough that all nodes process a proportional share of the input?
> Best Regards,
> Ted Xu
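Ted's chunking suggestion can be sketched in plain Java; the class name is made up and the file I/O is elided, leaving only the splitting logic (the 10-per-file figure comes from the thread):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split one large URL list into pieces of at most `linesPerFile`
// lines, so each piece can be written out as its own input file and
// therefore become its own map task.
public class UrlChunker {

  public static List<List<String>> chunk(List<String> lines, int linesPerFile) {
    List<List<String>> pieces = new ArrayList<List<String>>();
    for (int i = 0; i < lines.size(); i += linesPerFile) {
      // Copy each window so the pieces are independent of the source list.
      pieces.add(new ArrayList<String>(
          lines.subList(i, Math.min(i + linesPerFile, lines.size()))));
    }
    return pieces;
  }

  public static void main(String[] args) {
    List<String> urls = new ArrayList<String>();
    for (int i = 0; i < 25; i++) {
      urls.add("http://example.com/img" + i + ".jpg");
    }
    List<List<String>> pieces = chunk(urls, 10);
    System.out.println(pieces.size() + " pieces"); // prints "3 pieces"
  }
}
```

Each resulting piece would then be written to its own file in the job's input directory, giving the JobTracker one task per file regardless of block size.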