Accumulo >> mail # user >> No Recovery Node In Zookeeper

Re: No Recovery Node In Zookeeper
From a brief look, your change looks good. I would be concerned about
not having an 'else' block with a big-fat-warning though. The only
intent seems to be to notify in the logs that recoveries were found to
be processed. In the same fashion, I would expect a log message if the
entire recovery directory was not present.

Not being able to find that directory could signify larger problems
(like those you ran into), and could, possibly, result in a user
silently losing data due to missing recoveries.
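
To make the suggestion above concrete, here is a minimal sketch of the three cases (node missing, node present but empty, node present with entries). It is not the Master code: a Map stands in for ZooKeeper (path to child count, with a missing key playing the role of a null Stat), a List<String> stands in for the logger, and the class name, enum, and example path are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: the Map (path -> number of children) and the List<String> log
// are hypothetical stand-ins for ZooKeeper and the master's logger.
final class RecoveryCheck {

  enum Outcome { WATCH_REGISTERED, NO_RECOVERIES, RECOVERY_NODE_MISSING }

  static Outcome check(Map<String, Integer> nodes, String recoveryPath, List<String> log) {
    // A missing key plays the role of exists() returning a null Stat.
    Integer numChildren = nodes.get(recoveryPath);
    if (numChildren == null) {
      // The 'else' case asked for in the review: make the missing node loud.
      log.add("WARN recovery node " + recoveryPath
          + " does not exist; recoveries may be missing");
      return Outcome.RECOVERY_NODE_MISSING;
    }
    if (numChildren > 0) {
      // This is where the patched code would call getChildren(...) with its Watcher.
      log.add("INFO found " + numChildren + " recovery entries under " + recoveryPath);
      return Outcome.WATCH_REGISTERED;
    }
    // Node exists but has no children: nothing to recover right now.
    log.add("INFO recovery node " + recoveryPath + " exists but is empty");
    return Outcome.NO_RECOVERIES;
  }

  public static void main(String[] args) {
    Map<String, Integer> nodes = new HashMap<>();
    List<String> log = new ArrayList<>();
    // No entry for the path, so this exercises the warning branch.
    System.out.println(check(nodes, "/accumulo/example-instance/recovery", log));
    System.out.println(log);
  }
}
```

The point of returning a distinct outcome for the missing node is that the caller can decide whether to keep going or stop; what the Master should actually do in that case is exactly the open question in this thread.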

I'm not entirely sure of the implications downstream if the recovery dir
doesn't exist (what does the master do if it gets past the check with
your patch and no recovery/ directory exists?). I personally don't
know without diving further into the Master. I would want to get
verification from someone else before making a change that could have
such a large impact.

(poke, poke Eric/Keith/Adam/Billie/John)

- Josh

On 6/12/2012 10:46 PM, David Medinets wrote:
> This code does not avoid the recovery entries, it just checks that the
> entries exist before looping over them.
> On Tue, Jun 12, 2012 at 10:42 PM, William Slacum <[EMAIL PROTECTED]> wrote:
>> Does this just address your symptom? I'd be concerned that there was a
>> recovery issue that put the Accumulo instance in this state and with
>> the change in effect nobody would know about it.
>> On Tue, Jun 12, 2012 at 10:25 PM, David Medinets
>> <[EMAIL PROTECTED]>  wrote:
>>> I am grepping source left and right but am not sure what to make of
>>> this error. Here is the code from Master.java:
>>>     ZooReaderWriter.getInstance().getChildren(zroot + Constants.ZRECOVERY, new Watcher() {
>>>       @Override
>>>       public void process(WatchedEvent event) {
>>>         nextEvent.event("Noticed recovery changes", event.getType());
>>>       }
>>>     });
>>> I suggest replacing the above code with this:
>>>     final String recoveryPath = zroot + Constants.ZRECOVERY;
>>>     Stat stat = ZooReaderWriter.getInstance().getZooKeeper().exists(recoveryPath, null);
>>>     if (stat != null && stat.getNumChildren() > 0) {
>>>       ZooReaderWriter.getInstance().getChildren(recoveryPath, new Watcher() {
>>>         @Override
>>>         public void process(WatchedEvent event) {
>>>           nextEvent.event("Noticed recovery changes", event.getType());
>>>         }
>>>       });
>>>     }
>>> I have changed my local Accumulo and this change seems to be OK.
>>> However, since this is a change to Accumulo itself, I would like
>>> someone to code review before I commit this change. Does this change
>>> make sense?
>>> On Mon, Jun 11, 2012 at 9:54 PM, David Medinets
>>> <[EMAIL PROTECTED]>  wrote:
>>>> I am slowly working my way through whatever went wrong on my system.
>>>> This is the latest. I've deleted the logs and started the master by
>>>> hand:
>>>> accumulo org.apache.accumulo.server.master.state.SetGoalState NORMAL
>>>> start-server.sh localhost master
>>>> Then checked the log files where I saw this message:
>>>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
>>>> = NoNode for /accumulo/b519799c-3a51-4c9b-af21-96d577e2c11f/recovery
>>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>>>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1448)
>>>>         at org.apache.accumulo.core.zookeeper.ZooReader.getChildren(ZooReader.java:62)
>>>>         at org.apache.accumulo.server.master.Master.run(Master.java:2071)
>>>>         at org.apache.accumulo.server.master.Master.main(Master.java:2173)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)