Accumulo >> mail # user >> How to re-IP an Accumulo Cluster


Re: How to re-IP an Accumulo Cluster
Use zkCli.sh and look under /accumulo/<accumulo instance id>.

Starting in 1.4, Accumulo locks down its info in ZooKeeper, so you may need
to execute the following command first:

  addauth digest accumulo:SECRET

Replace SECRET with the secret (the instance.secret property) from your
accumulo-site.xml file.
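
The order of operations above (authenticate, then list) can be sketched in Python. This only builds the strings to type into zkCli.sh and does not talk to ZooKeeper. The helper name zkcli_commands and the example instance id are hypothetical, and the tservers child path is an assumption about the layout under the instance node that is not stated in this thread.

```python
def zkcli_commands(instance_id, secret):
    """Return the zkCli.sh commands to run, in order, as plain strings."""
    if not instance_id or not secret:
        raise ValueError("instance_id and secret are both required")
    root = f"/accumulo/{instance_id}"
    return [
        # Authenticate first; 1.4 restricts access to its znodes.
        f"addauth digest accumulo:{secret}",
        # Then list the instance root named in the reply above.
        f"ls {root}",
        # ASSUMPTION: tablet server entries live under a "tservers" child.
        f"ls {root}/tservers",
    ]


if __name__ == "__main__":
    # "1234-abcd" and "SECRET" are placeholders, not real values.
    for command in zkcli_commands("1234-abcd", "SECRET"):
        print(command)
```

If the listing comes back empty or is refused, the addauth step is the first thing to re-check, since it has to run in the same zkCli.sh session as the ls.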
On Thu, Aug 15, 2013 at 12:05 PM, Terry P. <[EMAIL PROTECTED]> wrote:

> Hi Keith,
> Many thanks for your detailed reply. I forgot to mention that yes indeed
> this is on Accumulo 1.4.2, and it was the write-ahead logs that were the
> issue -- partly because two of the tabletservers were not properly shut down
> before the re-IP operation, so recovery may have been needed on them.
>
> My naivety on Zookeeper certainly hampered the research as well.  How does
> one "look in zookeeper to see what is going on?"  Any pointers would be
> really helpful.
>
> I wish we could go to 1.5 and take advantage of the walogs in HDFS, but no
> can do at this point unfortunately.
>
>
> On Thu, Aug 15, 2013 at 10:24 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
>
>>
>>
>>
>> On Thu, Aug 15, 2013 at 11:01 AM, Terry P. <[EMAIL PROTECTED]> wrote:
>>
>>> Greetings everyone,
>>> We had to re-IP our entire cluster recently to change subnetworks, and
>>> we essentially lost everything (it was development, so no big deal).
>>> However, doing a re-IP operation may be required in actual operational
>>> cases, and I'd like to know if it can be done or not so we can note it for
>>> the future (as in document "what not to do" to avoid data loss).
>>>
>>> The issue we had was that after shutting down the cluster, re-IPing all
>>> servers, and starting everything back up, the tablets were still assigned
>>> to Tabletservers with the old IP addresses, even though all the hostnames
>>> were the same.  So the system showed 3 Tabletservers, but no tablets, and
>>> no entries in the tables where previously there were 400 million.
>>>
>>> So:
>>>
>>> A) Does Zookeeper track Tabletservers by IP address only, and not
>>> hostname?
>>>
>>
>> It does track by IP address, but not only IP address.  Each tablet server
>> has an ephemeral node in zookeeper under the IP address.  This ephemeral
>> node should go away when the tserver process dies, and then the master will
>> assume that tserver is dead.  The location of a tablet in the metadata
>> table is conceptually <ephemeral node id>+<IP address>, so once that
>> ephemeral node goes away the location in metadata table is assumed invalid
>> and the tablet is reassigned.   If another tserver starts at the same IP,
>> then the master can differentiate because the ephemeral node is different.
>>
>> You can look at the child nodes under a tserver IP in zookeeper.  Look
>> at the data for the lowest numbered ephemeral node to get info about
>> who holds the lock for that IP.
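
Picking the lowest numbered ephemeral node, as Keith describes, can be sketched in Python once you have the child names from an ls in zkCli.sh. ZooKeeper appends a 10-digit zero-padded counter to sequential node names; the "zlock-" prefix in the sample names and the function name lock_holder are assumptions for illustration, so substitute whatever names your listing actually shows.

```python
import re

# ZooKeeper sequential nodes end in a 10-digit, zero-padded counter.
_SEQ_SUFFIX = re.compile(r"(\d{10})$")


def lock_holder(children):
    """Return the child name with the lowest sequence number, or None."""
    numbered = []
    for name in children:
        match = _SEQ_SUFFIX.search(name)
        if match:
            numbered.append((int(match.group(1)), name))
    # Tuples compare by sequence number first, so min() picks the lowest.
    return min(numbered)[1] if numbered else None


if __name__ == "__main__":
    # Sample names are made up; paste the output of "ls" here instead.
    sample = ["zlock-0000000007", "zlock-0000000003", "zlock-0000000012"]
    print(lock_holder(sample))
```

Run a get in zkCli.sh on the node this returns to see the data Keith mentions. A None result means no sequential children were found under that address, which fits a tserver that is no longer holding a lock there.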
>>
>>
>>
>>
>>> B) If A is true, is there a mechanism to change those entries in
>>> Zookeeper so that a re-IP operation could be performed?
>>>
>>
>> A first step would be to look in zookeeper and see what is going on with
>> the ephemeral nodes.
>>
>> In Accumulo 1.3 and 1.4, one thing that normally causes problems when
>> changing lots of IP addrs is write ahead logs.   Tablets point to their
>> write ahead logs using the IP address of the logger, so after a re-IP
>> those pointers refer to addresses that no longer exist. This can cause
>> walog recovery to fail.  In 1.5 walogs are stored in HDFS, so this is not
>> an issue.
>>
>>
>