Re: Waiting for accumulo to be initialized
Krishmin,

Thank you for the response. It's always great to hear from someone who has
tried out the steps (even if you had a different issue). Like you said, I am
not really sure what caused the crash in our environment in the first place,
but having a plan is always good...

Thanks again all,
Aji
On Wed, Mar 27, 2013 at 5:00 PM, Krishmin Rai <[EMAIL PROTECTED]> wrote:

> Hi Aji,
> I wrote the original question linked below (about re-initing Accumulo over
> an existing installation).  For what it's worth, I believe that my
> ZooKeeper data loss was related to the linux+java leap second bug
> <https://access.redhat.com/knowledge/articles/15145> -- not
> likely to be affecting you now (I did not go back and attempt to re-create
> the issue, so it's also possible there were other compounding issues). We
> have not encountered any ZK data-loss problems since.
>
> At the time, I did some basic experiments to understand the process
> better, and successfully followed (essentially) the steps Eric has
> described. The only real difficulty I had was identifying which directories
> corresponded to which tables; I ended up iterating over individual RFiles
> and manually identifying tables based on expected data. This was a somewhat
> painful process, but at least made me confident that it would be possible
> in production.
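>
> For reference, that kind of inspection can be done with Accumulo's RFile
> PrintInfo utility. A minimal sketch, assuming a hypothetical file name;
> the -d/--dump flag, where your version supports it, prints keys and values
> so a table can be recognized by its row and column structure:
>
> $ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo -d \
>     /accumulo-old/tables/a/default_tablet/F0000abc.rf | head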
>
> It's also important to note that, at least according to my understanding,
> this procedure still potentially loses data: mutations written after the
> last minor compaction will only have reached the write-ahead-logs and will
> not be available in the raw RFiles you're importing from.
>
> -Krishmin
>
> On Mar 27, 2013, at 4:45 PM, Aji Janis wrote:
>
> Eric, really appreciate you jotting this down. Too late to try it out this
> time, but I will give this a try if (hopefully not) there is a next time to
> be had.
>
> Thanks again.
>
>
>
> On Wed, Mar 27, 2013 at 4:19 PM, Eric Newton <[EMAIL PROTECTED]> wrote:
>
>> I should write this up in the user manual.  It's not that hard, but it's
>> really not the first thing you want to tackle while learning how to use
>> accumulo.  I just opened ACCUMULO-1217
>> <https://issues.apache.org/jira/browse/ACCUMULO-1217> to do that.
>>
>> I wrote this from memory: expect errors.  Needless to say, you would only
>> want to do this when you are more comfortable with hadoop, zookeeper and
>> accumulo.
>>
>> First, get zookeeper up and running, even if you have to delete all its
>> data.
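>>
>> For a standalone ZooKeeper, that might look something like this (a
>> sketch; the dataDir path is an assumption -- use whatever your zoo.cfg
>> points at):
>>
>> $ ./bin/zkServer.sh stop
>> $ rm -rf /var/zookeeper/version-2    # wipes all ZooKeeper data
>> $ ./bin/zkServer.sh start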
>>
>> Next, attempt to determine the mapping of table names to tableIds.  You
>> can do this in the shell when your accumulo instance is healthy.  If it
>> isn't healthy, you will have to guess based on the data in the files in
>> HDFS.
>>
>> So, for example, the table "trace" is probably table id "1".  You can
>> find the files for trace in /accumulo/tables/1.
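>>
>> For example, to see the tablet directories and files for table id "1"
>> (directory names will vary):
>>
>> $ hadoop fs -ls /accumulo/tables/1
>> $ hadoop fs -ls /accumulo/tables/1/default_tablet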
>>
>> Don't worry if you get the names wrong.  You can always rename the tables
>> later.
>>
>> Move the old files for accumulo out of the way and re-initialize:
>>
>> $ hadoop fs -mv /accumulo /accumulo-old
>> $ ./bin/accumulo init
>> $ ./bin/start-all.sh
>>
>> Recreate your tables:
>>
>> $ ./bin/accumulo shell -u root -p mysecret
>> shell > createtable table1
>>
>> Learn the new table id mapping:
>> shell > tables -l
>> !METADATA => !0
>> trace => 1
>> table1 => 2
>> ...
>>
>> Bulk import all your data back into the new table ids:
>> Assuming you have determined that "table1" used to be table id "a" and is
>> now "2",
>> you do something like this:
>>
>> $ hadoop fs -mkdir /tmp/failed
>> $ ./bin/accumulo shell -u root -p mysecret
>> shell > table table1
>> shell table1 > importdirectory /accumulo-old/tables/a/default_tablet /tmp/failed true
>>
>> There are lots of directories under every table id directory.  You will
>> need to import each of them.  I suggest creating a script and passing it to
>> the shell on the command line.
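>>
>> A rough, untested sketch of such a script (it assumes every subdirectory
>> of the old table id directory is a tablet directory, and that your shell
>> supports the -f/--execute-file option):
>>
>> $ (echo "table table1"; \
>>    hadoop fs -ls /accumulo-old/tables/a | \
>>    awk '/^d/ {print "importdirectory " $NF " /tmp/failed true"}' \
>>   ) > import-table1.txt
>> $ ./bin/accumulo shell -u root -p mysecret -f import-table1.txt
>>
>> You may need a fresh, empty failures directory for each import.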
>>
>> I know of instances in which trillions of entries were recovered and
>> available in a matter of hours.
>>
>> -Eric
>>
>>