Re: DataNode and TaskTracker communication
Thank you so very much for the detailed response, Michael. I'll keep the tip
in mind. Please pardon my ignorance, as I am still in the learning phase.

Regards,
    Mohammad Tariq
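
For anyone following along: the "ping works, but a specific port doesn't"
check suggested in the quoted messages below can be scripted. This is only a
minimal sketch, not something taken from the thread. The helper names
(port_is_reachable, check_ports) are made up for illustration, and any host
name or port number you pass in is an assumption to replace with the values
from your own configuration.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused connection, a timeout (often a silently dropping firewall),
    or a name-resolution failure all count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, socket.timeout and socket.gaierror.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port in `ports` to True/False for the given host."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}
```

Run from the node that reports the errors, a call such as
check_ports("datanode1.example.com", [50010, 50020]) returns a dict of port
to True/False; the host name and port numbers here are placeholders. If the
host answers ping but a port comes back False, that points towards a firewall
rule or a service that is not listening on that interface.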

On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> 0.0.0.0 means that the call is going to all interfaces on the machine.
>  (Shouldn't be an issue...)
>
> IPv4 vs IPv6? Could be an issue, however OP says he can write data to DNs
> and they seem to communicate, therefore if it's IPv6 related, wouldn't it
> impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
>
> I don't disagree with your assessment. I am just suggesting that before
> you do a really deep dive, you think about the more obvious stuff first.
>
> There are a couple of other things... like do all of the /etc/hosts files
> on all of the machines match?
> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>
> BTW, you said DNS in your response. If you're using DNS, then you don't
> really want to have much info in the /etc/hosts file except loopback and
> the server's IP address.
>
> Looking at the problem, the OP is indicating that some traffic works while
> other traffic doesn't. Most likely something is blocking the ports. iptables
> is the first place to look.
>
> Just saying. ;-)
>
>
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
> Hi Michael,
>        I asked for the hosts file because there seems to me to be some
> loopback problem. The log shows that the call is going to 0.0.0.0. Apart
> from what you have said, I think disabling IPv6 and making sure that there
> is no problem with DNS resolution is also necessary. Please correct me if I
> am wrong.
> Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
>> Based on your /etc/hosts output, why aren't you using DNS?
>>
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>> generally work well when you're not using the FQDN or its alias.
>>
>> The issue isn't SSH. Go to the node that is having trouble connecting to
>> another node and try to ping that node, or try some other general
>> communication. If that succeeds, your issue is that the port you're
>> trying to communicate with is blocked. Then it's more than likely an
>> ipconfig or firewall issue.
>>
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <[EMAIL PROTECTED]>
>> wrote:
>>
>>  Hi Michael,
>>
>> Well, I can ssh from any node to any other without being prompted. The
>> reason is that my home directory is mounted on every server in the
>> cluster.
>>
>> I don't know whether the machines are multihomed. I could ask, if this
>> is important.
>>
>> Shall I?
>>
>> Regards,
>> Elmar
>>
>> On 13.08.12 14:59, Michael Segel wrote:
>>
>> If the nodes can communicate and distribute data, then the odds are that
>> the issue isn't going to be in his /etc/hosts.
>>
>>  A more relevant question is whether he's running a firewall on each of
>> these machines.
>>
>>  A simple test... ssh to one node, ping other nodes and the control
>> nodes at random to see if they can see one another. Then check to see if
>> there is a firewall running which would limit the types of traffic between
>> nodes.
>>
>>  One other side note... are these machines multi-homed?
>>
>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>
>> Hello there,
>>
>>       Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am currently trying to run my Hadoop program on a cluster. Sadly,
>>> though, my datanodes and tasktrackers seem to have difficulties with their
>>> communication, as their logs say:
>>> * Some datanodes and tasktrackers seem to have port problems of some kind,
>>> as can be seen in the logs below. I wondered if this might be due to