

Björn-Elmar Macek 2012-08-13, 12:31
Mohammad Tariq 2012-08-13, 12:51
Björn-Elmar Macek 2012-08-13, 13:08
Michael Segel 2012-08-13, 12:59
Björn-Elmar Macek 2012-08-13, 13:17
Michael Segel 2012-08-13, 13:36
Re: DataNode and TaskTracker communication
Hi Michael,
       I asked for the hosts file because this looks like a loopback problem
to me. The log shows that the call is going to 0.0.0.0. Apart from what you
have said, I think it is also necessary to disable IPv6 and to make sure that
DNS resolution works properly. Please correct me if I am wrong.
Thank you.
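
To make the loopback/DNS check concrete, here is a minimal sketch, assuming Python 3 is available on the nodes. It is not part of Hadoop; the function names `resolve_ipv4` and `is_suspicious` are made up for illustration. It prints what the local resolver returns for the node's own names and flags any loopback (127.x.x.x) or wildcard (0.0.0.0) address:

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def is_suspicious(address):
    """True for loopback (127.0.0.0/8) or unspecified (0.0.0.0) addresses."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip.is_unspecified


if __name__ == "__main__":
    # Check the names this node knows itself by, plus "localhost".
    for name in (socket.gethostname(), socket.getfqdn(), "localhost"):
        try:
            addresses = resolve_ipv4(name)
        except socket.gaierror as exc:
            print(f"{name}: resolution failed ({exc})")
            continue
        for address in addresses:
            flag = "  <-- loopback or wildcard" if is_suspicious(address) else ""
            print(f"{name} -> {address}{flag}")
```

If the node's own hostname or FQDN comes back flagged, that would point towards the hosts/DNS setup rather than the firewall. "localhost" resolving to a loopback address is of course expected.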

Regards,
    Mohammad Tariq
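
P.S. The "ping works but the port is blocked" case from the quoted mail below can be tested with a plain TCP connect. This is only a sketch, assuming Python 3 on the node that has trouble connecting; `can_connect` is a hypothetical helper name, and the host and port are simply the ones that appear in the datanode log (its-cs131, 35554):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name resolution.
        return False


if __name__ == "__main__":
    # Host and port taken from the "Retrying connect to server" log lines.
    host, port = "its-cs131", 35554
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If ping succeeds but this reports "NOT reachable", either nothing is listening on that port or something in between (for example a firewall) is dropping the traffic.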

On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> Based on your /etc/hosts output, why aren't you using DNS?
>
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
> generally work well when you're not using the FQDN or its alias.
>
> The issue isn't SSH. Go to the node that is having trouble connecting to
> another node and try to ping it, or try some other general communication.
> If that succeeds, your issue is that the port you're trying to communicate
> with is blocked. Then it's more than likely an IP configuration or
> firewall issue.
>
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <[EMAIL PROTECTED]>
> wrote:
>
>  Hi Michael,
>
> Well, I can ssh from any node to any other without being prompted. The
> reason is that my home dir is mounted on every server in the
> cluster.
>
> Whether the machines are multihomed, I don't know. I could ask, if this
> is important.
>
> Shall I?
>
> Regards,
> Elmar
>
> On 13.08.12 14:59, Michael Segel wrote:
>
> If the nodes can communicate and distribute data, then the odds are that
> the issue isn't going to be in his /etc/hosts.
>
>  A more relevant question is whether he's running a firewall on each of
> these machines.
>
>  A simple test... ssh to one node, ping other nodes and the control nodes
> at random to see if they can see one another. Then check to see if there is
> a firewall running which would limit the types of traffic between nodes.
>
>  One other side note... are these machines multi-homed?
>
>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
> Hello there,
>
>       Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I am currently trying to run my Hadoop program on a cluster. Sadly, though,
>> my datanodes and tasktrackers seem to have difficulties communicating,
>> as their logs show:
>> * Some datanodes and tasktrackers seem to have port problems of some kind,
>> as can be seen in the logs below. I wondered if this might be related to
>> the localhost entry in /etc/hosts, as you can read in a lot of posts
>> with similar errors, but I checked the file: neither localhost nor
>> 127.0.0.1/127.0.1.1 is bound there. (You can still ping localhost,
>> though; the technician of the cluster said he'd look into the
>> mechanism that resolves localhost.)
>> * The other nodes cannot talk to the namenode and jobtracker
>> (its-cs131). It is not at all clear why this is happening: the
>> "dfs -put" I do directly before the job runs fine, which seems to
>> imply that communication between those servers is working flawlessly.
>>
>> Is there any reason why this might happen?
>>
>>
>> Regards,
>> Elmar
>>
>> LOGS BELOW:
>>
>> \____Datanodes
>>
>> After successfully putting the data to HDFS (at this point I thought
>> namenode and datanodes have to communicate), I get the following errors
>> when starting the job:
>>
>> There are 2 kinds of logs I found: the first one is big (about 12MB) and
>> looks like this:
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>> time(s).
>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
Björn-Elmar Macek 2012-08-13, 14:57
James Brown 2012-08-14, 06:51
Sriram Ramachandrasekaran... 2012-08-13, 16:37
Michael Segel 2012-08-13, 20:39
Björn-Elmar Macek 2012-08-16, 13:17
Björn-Elmar Macek 2012-08-20, 10:15