Re: Distributing the code to multiple nodes
The logs were updated only while I was copying the data. Since the copy
finished there have been no further updates to the log files.
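
A minimal way to watch for new log entries on each node, assuming the
default Hadoop 2.x log layout under $HADOOP_HOME/logs (adjust if
HADOOP_LOG_DIR points elsewhere):

    # Follow the DataNode and NodeManager logs; the file names include
    # the user and hostname, hence the wildcards.
    tail -f $HADOOP_HOME/logs/hadoop-*-datanode-*.log \
            $HADOOP_HOME/logs/yarn-*-nodemanager-*.log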
On Thu, Jan 9, 2014 at 5:08 PM, Chris Mawata <[EMAIL PROTECTED]> wrote:

> Do the logs on the three nodes contain anything interesting?
> Chris
>  On Jan 9, 2014 3:47 AM, "Ashish Jain" <[EMAIL PROTECTED]> wrote:
>
>> Here is the block info for the file I distributed. As can be seen, only
>> 10.12.11.210 has all the data, and this is the node serving all the
>> requests. Replicas are available on 209 as well as 211.
>>
>> 1073741857:  10.12.11.210:50010, 10.12.11.209:50010
>> 1073741858:  10.12.11.210:50010, 10.12.11.211:50010
>> 1073741859:  10.12.11.210:50010, 10.12.11.209:50010
>> 1073741860:  10.12.11.210:50010, 10.12.11.211:50010
>> 1073741861:  10.12.11.210:50010, 10.12.11.209:50010
>> 1073741862:  10.12.11.210:50010, 10.12.11.209:50010
>> 1073741863:  10.12.11.210:50010, 10.12.11.209:50010
>> 1073741864:  10.12.11.210:50010, 10.12.11.209:50010
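>>
>> A listing like this can be reproduced with fsck; a sketch, assuming the
>> hypothetical input path /user/ashish/input.txt:
>>
>>     # Print every block of the file together with the datanodes
>>     # holding its replicas
>>     hdfs fsck /user/ashish/input.txt -files -blocks -locations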
>>
>> --Ashish
>>
>>
>> On Thu, Jan 9, 2014 at 2:11 PM, Ashish Jain <[EMAIL PROTECTED]> wrote:
>>
>>> Hello Chris,
>>>
>>> I now have a cluster with 3 nodes and a replication factor of 2. When I
>>> distribute a file I can see that replicas of the data are available on the
>>> other nodes. However, when I run a MapReduce job, once again only one node
>>> serves all the requests :(. Can you or anyone else please provide some
>>> more input?
>>>
>>> Thanks
>>> Ashish
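>>>
>>> One quick check while the job is running, as a sketch: yarn node -list
>>> reports the number of running containers per NodeManager, so it shows
>>> directly whether more than one node is doing work:
>>>
>>>     # Each line reports Node-Id, state, web address, and the count
>>>     # of running containers on that node
>>>     yarn node -list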
>>>
>>>
>>> On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <[EMAIL PROTECTED]> wrote:
>>>
>>>> 2 nodes and a replication factor of 2 result in a replica of each block
>>>> being present on each node. This allows the possibility that a single
>>>> node does all the work and is still data-local.  That will probably
>>>> happen if that single node has the needed capacity.  More nodes than the
>>>> replication factor are needed to force distribution of the processing.
>>>>  Chris
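>>>>
>>>> A sketch of checking and changing a file's replication from the shell
>>>> (the path is hypothetical):
>>>>
>>>>     # The second column of the listing is the replication factor
>>>>     hdfs dfs -ls /user/ashish/input.txt
>>>>     # Raise or lower it; -w waits until re-replication completes
>>>>     hdfs dfs -setrep -w 2 /user/ashish/input.txt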
>>>> On Jan 8, 2014 7:35 AM, "Ashish Jain" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Guys,
>>>>>
>>>>> I am sure that only one node is being used. I just now ran the job
>>>>> again and could see the CPU usage go high on only one server while the
>>>>> other server's CPU usage remained constant, which means the other node is
>>>>> not being used. Can someone help me debug this issue?
>>>>>
>>>>> ++Ashish
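>>>>>
>>>>> After a run finishes, the aggregated logs show which host each container
>>>>> ran on; a sketch, assuming yarn.log-aggregation-enable is set to true
>>>>> and with the application id left as a placeholder:
>>>>>
>>>>>     # Each section of the output is headed "Container: <id> on <host>"
>>>>>     yarn logs -applicationId <application-id> | grep 'Container:'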
>>>>>
>>>>>
>>>>> On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I have a 2-node Hadoop cluster running with a replication factor of
>>>>>> 2. I have a file of around 1 GB which, when copied to HDFS, is
>>>>>> replicated to both nodes. From the block info I can see that the file
>>>>>> has been split into 8 blocks of 128 MB each.  I use this file as input
>>>>>> to run the word count program. Somehow I feel only one node is doing all
>>>>>> the work and the work is not distributed to the other node. How can I
>>>>>> make sure the work is distributed to both nodes? Also, is there a log or
>>>>>> GUI I can use to verify this? Please note I am using the latest stable
>>>>>> release, 2.2.0.
>>>>>>
>>>>>> ++Ashish
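>>>>>>
>>>>>> As for a GUI: the stock 2.2.0 web UIs cover this; the ports below are
>>>>>> the defaults and may differ in a tuned config:
>>>>>>
>>>>>>     # ResourceManager UI: running applications and per-node containers
>>>>>>     http://<resourcemanager-host>:8088
>>>>>>     # NameNode UI: datanode status and block locations of files
>>>>>>     http://<namenode-host>:50070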
>>>>>>
>>>>>
>>>
>>