Accumulo, mail # user - No Recovery Node In Zookeeper


Re: No Recovery Node In Zookeeper
William Slacum 2012-06-13, 03:21
Just to clarify, I don't think your code breaks anything. It's more
robust than simply assuming some state exists in ZooKeeper, and its
error handling is more complete. I'm concerned about what happened
beforehand to violate the assumptions the master makes in this case.
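
One way to keep the guard without hiding the underlying problem is to report the missing node instead of skipping it silently. The following is only a minimal sketch of that idea, not the actual Master.java code: `NodeStore`, `InMemoryStore`, `watchRecovery`, and the example path are hypothetical names standing in for the ZooKeeper access, so the example runs without a server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RecoveryWatchSketch {

    /** Hypothetical stand-in for the ZooKeeper calls the master makes. */
    interface NodeStore {
        /** Number of children under path, or -1 if the node does not exist. */
        int childCount(String path);

        /** Registers a callback for child changes under path. */
        void watchChildren(String path, Runnable onChange);
    }

    /** In-memory fake so the sketch runs without a ZooKeeper server. */
    static final class InMemoryStore implements NodeStore {
        final Map<String, Integer> nodes = new HashMap<>();
        final List<String> watched = new ArrayList<>();

        @Override
        public int childCount(String path) {
            return nodes.getOrDefault(path, -1);
        }

        @Override
        public void watchChildren(String path, Runnable onChange) {
            watched.add(path);
        }
    }

    enum Outcome { WATCHING, NODE_MISSING }

    /**
     * Sets the watch when the recovery node exists (even if it is empty) and
     * records a warning when it does not, so the missing node stays visible.
     */
    static Outcome watchRecovery(NodeStore store, String recoveryPath, List<String> warnings) {
        if (store.childCount(recoveryPath) < 0) {
            warnings.add("Expected recovery node is missing: " + recoveryPath);
            return Outcome.NODE_MISSING;
        }
        store.watchChildren(recoveryPath, () -> { });
        return Outcome.WATCHING;
    }

    public static void main(String[] args) {
        String path = "/accumulo/example-instance/recovery"; // hypothetical path
        InMemoryStore store = new InMemoryStore();
        List<String> warnings = new ArrayList<>();

        // Node absent: a warning is recorded and no watch is set.
        System.out.println(watchRecovery(store, path, warnings));

        // Node present but empty: the watch is still set.
        store.nodes.put(path, 0);
        System.out.println(watchRecovery(store, path, warnings));
        System.out.println(warnings);
    }
}
```

In this sketch the watch is registered whenever the node exists, including when it has no children, so later recovery entries would still trigger a notification. Whether that matches what the master needs is an assumption here.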

On Tue, Jun 12, 2012 at 11:01 PM, William Slacum <[EMAIL PROTECTED]> wrote:
> Yes, but there's a reason the master was expecting the recovery nodes
> to exist, and I don't think that reason has been uncovered.
>
> On Tue, Jun 12, 2012 at 10:46 PM, David Medinets
> <[EMAIL PROTECTED]> wrote:
>> This code does not avoid the recovery entries; it just checks that the
>> entries exist before looping over them.
>>
>> On Tue, Jun 12, 2012 at 10:42 PM, William Slacum <[EMAIL PROTECTED]> wrote:
>>> Does this just address your symptom? I'd be concerned that there was a
>>> recovery issue that put the Accumulo instance in this state and with
>>> the change in effect nobody would know about it.
>>>
>>> On Tue, Jun 12, 2012 at 10:25 PM, David Medinets
>>> <[EMAIL PROTECTED]> wrote:
>>>> I am grepping source left and right but am not sure what to make of
>>>> this error. Here is the code from Master.java:
>>>>
>>>>    ZooReaderWriter.getInstance().getChildren(zroot +
>>>> Constants.ZRECOVERY, new Watcher() {
>>>>      @Override
>>>>      public void process(WatchedEvent event) {
>>>>        nextEvent.event("Noticed recovery changes", event.getType());
>>>>      }
>>>>    });
>>>>
>>>> I suggest replacing the above code with this:
>>>>
>>>>    final String recoveryPath = zroot + Constants.ZRECOVERY;
>>>>    Stat stat =
>>>> ZooReaderWriter.getInstance().getZooKeeper().exists(recoveryPath,
>>>> null);
>>>>    if (stat != null && stat.getNumChildren() > 0) {
>>>>      ZooReaderWriter.getInstance().getChildren(recoveryPath, new Watcher() {
>>>>        @Override
>>>>        public void process(WatchedEvent event) {
>>>>          nextEvent.event("Noticed recovery changes", event.getType());
>>>>        }
>>>>      });
>>>>    }
>>>>
>>>> I have made this change in my local Accumulo and it seems to work.
>>>> However, since this is a change to Accumulo itself, I would like
>>>> someone to code review before I commit this change. Does this change
>>>> make sense?
>>>>
>>>> On Mon, Jun 11, 2012 at 9:54 PM, David Medinets
>>>> <[EMAIL PROTECTED]> wrote:
>>>>> I am slowly working my way through whatever went wrong on my system.
>>>>> This is the latest. I've deleted the logs and started the master by
>>>>> hand:
>>>>>
>>>>> accumulo org.apache.accumulo.server.master.state.SetGoalState NORMAL
>>>>> start-server.sh localhost master
>>>>>
>>>>> Then checked the log files where I saw this message:
>>>>>
>>>>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
>>>>> = NoNode for /accumulo/b519799c-3a51-4c9b-af21-96d577e2c11f/recovery
>>>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>>>>>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>>>        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1448)
>>>>>        at org.apache.accumulo.core.zookeeper.ZooReader.getChildren(ZooReader.java:62)
>>>>>        at org.apache.accumulo.server.master.Master.run(Master.java:2071)
>>>>>        at org.apache.accumulo.server.master.Master.main(Master.java:2173)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>        at java.lang.reflect.Method.invoke(Method.java:601)
>>>>>
>>>>> I've run out of time for debugging today. I'll dig into the source
>>>>> code more tomorrow, unless someone can point me in the right
>>>>> direction to resolve this.