I was going through the definition of an uber job in Hadoop.
A job is considered uber when it has fewer than 10 maps, at most one reducer, and the
complete input data is smaller than one HDFS block.
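For reference, these thresholds are controlled by job configuration properties. A sketch of the relevant settings, with the defaults as I understand them from Hadoop 2.x (worth double-checking against your version's mapred-default.xml):

```xml
<!-- Allow small jobs to run "uber", i.e. inside the ApplicationMaster's JVM -->
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>

<!-- Maximum number of maps for a job to qualify as uber (default 9) -->
<property>
  <name>mapreduce.job.ubertask.maxmaps</name>
  <value>9</value>
</property>

<!-- Maximum number of reduces (default 1) -->
<property>
  <name>mapreduce.job.ubertask.maxreduces</name>
  <value>1</value>
</property>

<!-- Maximum total input bytes; when unset it defaults to the HDFS block size -->
<property>
  <name>mapreduce.job.ubertask.maxbytes</name>
  <value></value>
</property>
```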
I have some doubts here-
Splits are created based on the HDFS block size. Getting 10 mappers from one
block of data is possible with a settings change (lowering the maximum split
size). But I am trying to understand why a job would need to run around 10
maps for just 64 MB of data.
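To illustrate the settings change I mean: capping the split size forces the input format to cut one block into several splits. The property name below is from Hadoop 2.x (older releases used mapred.max.split.size); the value is an assumption chosen so a 64 MB block yields roughly 10 splits:

```xml
<!-- Cap each input split at ~6.4 MB, so one 64 MB block produces ~10 map tasks -->
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>6710886</value>
</property>
```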
One possibility is that the job is immensely CPU intensive. Would that be a
correct assumption, or is there some other reason for this?