MapReduce, mail # user - DataNodes fail to send heartbeat to HA-enabled NameNode


Re: DataNodes fail to send heartbeat to HA-enabled NameNode
Todd Lipcon 2012-10-30, 20:16
BTW, I forgot that I did file a ticket a while back on a related issue:
https://issues.apache.org/jira/browse/hdfs-2882

My assumption is that, higher up in the logs, you will find an underlying
issue which caused NPEs later.

-Todd

On Tue, Oct 30, 2012 at 11:23 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> Hi Takahiko,
>
> Can you please provide the full datanode log up to the point where you
> first see an NPE?
>
> FWIW, this error has nothing to do with the new QuorumJournalManager
> feature -- I've seen this bug once or twice over the last couple years but
> never been able to reproduce it reliably.
>
> -Todd
>
>
> On Tue, Oct 30, 2012 at 10:06 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
>
>>
>>
>> On 30 October 2012 11:10, Takahiko Kawasaki <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> I am having trouble with quorum-based HDFS HA on CDH 4.1.1.
>>>
>>> 2012-10-30 19:28:16,817 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in
>>> BPOfferService for Block pool
>>> BP-2063217961-192.168.62.231-1351263110470 (storage id
>>> DS-2090122187-192.168.62.233-50010-1338981658216) service to
>>> node02.example.com/192.168.62.232:8020
>>> java.lang.NullPointerException
>>>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:435)
>>>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:521)
>>>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:674)
>>>         at java.lang.Thread.run(Thread.java:662)
>>> --------------------
>>>
>>>
>> Looks like you've been the first person to find an issue in some code that
>> is very, very fresh.
>>
>> File a bug report on JIRA; try to replicate it on the latest apache alpha
>> release if you can.
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

--
Todd Lipcon
Software Engineer, Cloudera
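
For anyone following the advice in this thread to look higher up in the
datanode log for the underlying issue, here is a minimal sketch of how that
search might be automated. It is not part of Hadoop: the function names are
hypothetical, and it assumes each log entry starts with a timestamp and level
in the form shown in the excerpt above (e.g. "2012-10-30 19:28:16,817 ERROR").
It lists the WARN/ERROR/FATAL entries that appear before the first
NullPointerException.

```python
import re

# Assumed log4j-style entry prefix, e.g. "2012-10-30 19:28:16,817 ERROR ..."
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+) ")


def split_entries(lines):
    """Group raw lines into (level, text) entries.

    Lines without a timestamp prefix (stack-trace lines) are attached to
    the entry that precedes them.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            entries.append([match.group(1), line])
        elif entries:
            entries[-1][1] += "\n" + line
    return [(level, text) for level, text in entries]


def problems_before_first_npe(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return the WARN/ERROR/FATAL entries logged before the first NPE entry.

    Returns an empty list if no NullPointerException is found.
    """
    entries = split_entries(lines)
    for i, (_, text) in enumerate(entries):
        if "java.lang.NullPointerException" in text:
            return [entry for entry in entries[:i] if entry[0] in levels]
    return []
```

To use it, pass an open file handle for the datanode log (or any iterable of
lines) to problems_before_first_npe and print the text of each returned
entry. The earliest entries are the most likely candidates for the root cause.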