MapReduce, mail # user - Hadoop processing


Kartashov, Andy 2012-11-08, 14:35
Jay Vyas 2012-11-08, 14:49
Re: Hadoop processing
Mohammad Tariq 2012-11-08, 15:05
Hello Andy,

     Just to add to what Mr. Jay has said, the MR framework does its best to
run each map task on a node where its input data is present. Sometimes,
however, none of the nodes hosting a replica of the data block for a map
task's input split (how many there are depends on the replication factor)
has a free map slot. In that case, the job scheduler looks for a free map
slot on a node in the same rack as one of the replicas. Very occasionally
even this is not possible, so an off-rack node is used. So nodes that hold
no data yet can still be given map tasks; they just have to read their
input over the network.
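
To make the order of preference concrete, here is a minimal Python sketch
of the fallback described above. It is only an illustration, not the
JobTracker's actual code: the Node class, the pick_node function, and the
node and rack names are all made up for this example, and the real
scheduler hands out tasks in response to TaskTracker heartbeats rather
than scanning the cluster on behalf of one task.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str
    free_map_slots: int


def pick_node(replica_hosts, nodes):
    """Choose a node for one map task.

    replica_hosts: names of the nodes holding a replica of the input block.
    Returns (node, locality), or (None, None) if no map slot is free.
    """
    by_name = {n.name: n for n in nodes}
    free = [n for n in nodes if n.free_map_slots > 0]

    # 1. Data-local: a node that holds a replica and has a free slot.
    for n in free:
        if n.name in replica_hosts:
            return n, "data-local"

    # 2. Rack-local: a free node in the same rack as one of the replicas.
    replica_racks = {by_name[h].rack for h in replica_hosts if h in by_name}
    for n in free:
        if n.rack in replica_racks:
            return n, "rack-local"

    # 3. Off-rack: any node with a free slot.
    if free:
        return free[0], "off-rack"
    return None, None


# Hypothetical cluster: dn1..dn3 hold data but are busy; new1 and new2
# were just added and hold no blocks yet.
nodes = [
    Node("dn1", "rack1", 0),
    Node("dn2", "rack1", 0),
    Node("dn3", "rack2", 0),
    Node("new1", "rack1", 1),
    Node("new2", "rack3", 1),
]

node, locality = pick_node({"dn1", "dn3"}, nodes)
print(node.name, locality)  # new1 rack-local
```

In this made-up cluster the empty node new1 receives the task because it
shares a rack with a replica, which is the situation Andy asked about.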

Regards,
    Mohammad Tariq

On Thu, Nov 8, 2012 at 8:19 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:

> Hmm this is interesting.  I think that:
>
> 1) For the map phase, Hadoop is smart enough to try to run mappers
> locally, but I think you could force these DNs to actively participate in a
> map job by decreasing the size of the input splits, which would allow for
> many more mappers, some of which would be forced to run on data that is
> not necessarily local. In this scenario, those DNs don't yet have any
> local files on them that would be used for the input.
>
> 2) For the reduce phase, since the reducers will of course be copying
> mapper outputs from all over the cluster, one would expect that your new
> nodes would naturally take part in this portion of the job if the
> number of reducers (mapred.reduce.tasks in Hadoop 1.x) was specified.
>
>
> On Thu, Nov 8, 2012 at 9:35 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
>
>>  Hadoopers,
>>
>> “Hadoop ships the code to the data instead of sending the data to the
>> code.”
>>
>> Say you added two DNs/TTs to the cluster. They have no data at this
>> point, i.e. you have not run the balancer.
>>
>> In view of the above quoted statement, will these two nodes not
>> participate in the MapReduce job until you have balanced some data onto
>> those nodes? Please kindly elaborate.
>>
>>
>>
>> Rgds,
>>
>> AK47
>>
>
>
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>
Michael Segel 2012-11-08, 15:03
Kartashov, Andy 2012-11-08, 15:57