I have faced a similar issue. In my job the block size is 64MB, the number
of maps created is 656, and the number of files uploaded to HDFS is 656,
each file being 11MB in size. I assume that because the files are smaller
than the block size, they are not being grouped into fewer splits.
Could you kindly clarify this?
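For reference, a quick back-of-the-envelope sketch (plain Python, not Hadoop code) of why you see one map per file: with the default FileInputFormat, a split never spans file boundaries, so each file smaller than the split size (here, the 64MB block size) produces exactly one split and one map task. The function name below is my own illustration, not a Hadoop API:

```python
import math

def num_splits(file_sizes_mb, split_size_mb=64):
    """Approximates default FileInputFormat behavior: splits never span
    files, so each file contributes ceil(size / split_size) splits."""
    return sum(math.ceil(size / split_size_mb) for size in file_sizes_mb)

# 656 files of 11MB each, with a 64MB split size:
# each file is below the split size, so one split (one map) per file.
print(num_splits([11] * 656))  # -> 656
```

If you want several small files packed into one split, you would need an input format designed for that (e.g. Hadoop's CombineFileInputFormat) rather than the default, which never groups files.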
On Fri, Jul 6, 2012 at 10:30 PM, Robert Evans <[EMAIL PROTECTED]> wrote:
> How long a program takes to run depends on a lot of things. It could be a
> connectivity issue, or it could be that your program does a lot more
> processing for some input records than for others, or it could be that some
> of your records are a lot smaller, so that more of them fit in a single
> input split. Without knowing what the code is doing, it is hard to say
> more than that.
> --Bobby Evans
> From: Kasi Subrahmanyam <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <
> [EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: issue with map running time
> Hi,
> I have a job which has, let us say, 10 mappers running in parallel.
> Some are running fast, but a few of them are taking too long to run.
> For example, some mappers take 5 to 10 minutes, but others take
> around 12 hours or more.
> Can the difference in the data handled by the mappers cause such a
> variation, or is it a connectivity issue?
> Note: The cluster we are using has multiple users running their jobs on it.
> Thanks in advance.