Hi.
I am having trouble with my Accumulo installation. After a hardware failure on the NameNode, the !METADATA table's root_tablet is broken :(
From "fsck /" output:
....
/accumulo/tables/!0/root_tablet/A000ornd.rf: CORRUPT block blk_-8590712379082603283
/accumulo/tables/!0/root_tablet/A000ornd.rf: MISSING 1 blocks of total size 896 B..
....
What would you recommend to recover the data?
Is it possible to reconstruct !METADATA's root_tablet based on the rest of the !METADATA files?
Or is it possible to reconstruct the whole !METADATA table based on the content of all the tablets that were found?
Are there any ready-made tools to do it?
Thanks.
Denis 2012-08-19, 03:08
John Vines 2012-08-19, 04:10
When you have a namenode failure and you recover with the Secondary Namenode info, you're dealing with one level of potentially expired pointers. On top of that, you have more layers of pointers WRT the root tablet and !METADATA tablets. You can make attempts to recover, but what is more apt to happen is you'll get a root tablet up that has some, but not all, of the current !METADATA table files. And the ones you do get up may or may not be pointing to the existing files for your tablets.
What I'm ultimately trying to say is that you have already lost some files, and you are more apt to lose more by trying to recover your old information than by taking what you have and starting over. I would suggest taking your accumulo directory, moving it to accumulo_old or something along those lines, initializing a new instance, and bulk importing the remaining old data back into the new system.
John
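As a rough illustration of preparing for that bulk import, the sketch below groups surviving .rf paths by table id so each group can be staged for its own import. It assumes the layout seen in the fsck output (`<root>/<table id>/<tablet dir>/<file>.rf`), that the old directory was renamed to `/accumulo_old`, and that `!0` is the damaged !METADATA table to skip. The function name and the sample paths are hypothetical, and obtaining the path list (for example from a recursive HDFS listing) is left out.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_rfiles_by_table(paths, root="/accumulo_old/tables"):
    """Map table id -> sorted list of .rf paths found directly under root.

    Assumes the layout root/<table id>/<tablet dir>/<file>.rf.
    """
    root_parts = PurePosixPath(root).parts
    groups = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        parts = path.parts
        # Ignore anything that is not an RFile under the expected root.
        if path.suffix != ".rf" or parts[:len(root_parts)] != root_parts:
            continue
        # Ignore paths that are not exactly table/tablet/file deep.
        if len(parts) != len(root_parts) + 3:
            continue
        table_id = parts[len(root_parts)]
        if table_id == "!0":
            continue  # the damaged !METADATA table is not re-imported
        groups[table_id].append(p)
    return {table: sorted(files) for table, files in groups.items()}
```

Each resulting group would still need to be copied into a staging directory and imported into a matching table in the new instance; that step depends on the installed Accumulo version and is not shown.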
Hi.
That is what I am going to do. I still have terabytes of .rf files (!METADATA was the only table affected by the crash) and gigabytes of walog files, and I am trying to extract the data from them and insert it into a new database.
Do you know of any dumping tools for .rf and walog files? (I started to write my own, but if there are existing ones, they could save me some time.) Am I right in understanding that if all the content of the .rf and walog files is simply inserted into the new db, the VersioningIterator will remove any collisions they may have?
Denis 2012-08-19, 04:50
Keith Turner 2012-08-20, 18:46
One thing to be aware of is that deleted data could come back if you just import all of the files. This can happen because tablets can reference files that contain data outside of the tablet's range, as a result of splits and bulk imports. Below is an example of this.
* Tablet1 refs file F1, which contains X.
* Tablet1 splits into Tablet2 and Tablet3. Both tablets reference file F1. X falls within Tablet3.
* X is deleted from Tablet3.
* Tablet3 compacts to file F2. F2 does not contain X.
After the sequence of events above, X still exists in file F1 even though it was deleted and compacted away in Tablet3. Normally this is not a problem because Tablet2 does not read the part of file F1 that contains X. However, if you just take F1 and import the file, then X will come back. The fact that you are only interested in a portion of file F1 is lost with the !METADATA table.
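The sequence above can be replayed with a toy model. This is only a sketch: files are plain dicts of row -> (timestamp, value), a value of None stands in for a delete marker, the split point "m" and rows "a" and "z" are made up, and none of it uses the Accumulo API.

```python
def in_range(row, rng):
    # rng = (exclusive lower bound, inclusive upper bound); None means unbounded.
    lo, hi = rng
    return (lo is None or row > lo) and (hi is None or row <= hi)


def merge(files, rng):
    """Newest entry per row across all files, restricted to rng."""
    newest = {}
    for f in files:
        for row, (ts, value) in f.items():
            if in_range(row, rng) and (row not in newest or ts > newest[row][0]):
                newest[row] = (ts, value)
    return newest


def compact(files, rng):
    """Full compaction: one new file, with delete markers (None) dropped."""
    return {r: e for r, e in merge(files, rng).items() if e[1] is not None}


def scan(files, rng):
    """What a reader sees: live rows and their values within rng."""
    return {r: value for r, (ts, value) in compact(files, rng).items()}


# Tablet1 refs file F1, which contains X (row "x") plus two other rows.
F1 = {"a": (1, "A"), "x": (1, "X"), "z": (1, "Z")}
# Tablet1 splits at "m"; both children reference F1. X falls within Tablet3.
tablet2 = {"range": (None, "m"), "files": [F1]}
tablet3 = {"range": ("m", None), "files": [F1]}
# X is deleted from Tablet3, then Tablet3 compacts to F2 (no X in it).
delete_x = {"x": (2, None)}
F2 = compact(tablet3["files"] + [delete_x], tablet3["range"])
tablet3["files"] = [F2]

# With the metadata intact, each tablet reads only its own range.
with_metadata = {**scan(tablet2["files"], tablet2["range"]),
                 **scan(tablet3["files"], tablet3["range"])}
# Without it, importing every file whole brings X back from F1.
naive_import = scan([F1, F2], (None, None))

print(sorted(with_metadata))  # ['a', 'z']
print(sorted(naive_import))   # ['a', 'x', 'z']
```

In this model, versioning by timestamp does not help: the delete marker was discarded during compaction, so nothing newer than the old copy of X remains to suppress it.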
William Slacum 2012-08-19, 12:24
I know that in trunk you can run `./bin/accumulo rfile-info -d /accumulo/path/to/rfile`. If that's unavailable, you can run `./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo -d /accumulo/path/to/rfile`. I'll defer to someone else for the walogs.
Eric Newton 2012-08-19, 15:44
$ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /some/walog/filename
The log reader truncates long mutations by default. Tablet names are compressed to unique ids, so you will see DEFINE_TABLET entries, which map the tablet to a tablet id, and then the tablet id is used in the recorded mutations.
You will want to keep the Accumulo garbage collector offline until you are done.
-Eric
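To make the DEFINE_TABLET indirection concrete, here is a minimal sketch of resolving tablet ids back to tablets. It assumes the log has already been parsed into `(kind, tablet_id, payload)` tuples; that tuple shape, the "MUTATION" label, and the function name are hypothetical, and the actual text format printed by the log reader is not modeled here.

```python
from collections import defaultdict


def mutations_by_tablet(entries):
    """Group mutation payloads by the tablet their tablet id refers to.

    entries: iterable of (kind, tablet_id, payload) tuples in log order.
    For "DEFINE_TABLET" the payload is the tablet name; for "MUTATION"
    it is the recorded mutation. Other kinds are ignored.
    """
    tablet_for_id = {}
    grouped = defaultdict(list)
    for kind, tablet_id, payload in entries:
        if kind == "DEFINE_TABLET":
            # A later definition of the same id replaces the earlier one.
            tablet_for_id[tablet_id] = payload
        elif kind == "MUTATION":
            if tablet_id not in tablet_for_id:
                raise ValueError(f"mutation for undefined tablet id {tablet_id}")
            grouped[tablet_for_id[tablet_id]].append(payload)
    return dict(grouped)
```

Because long mutations are truncated by default in the reader's output, a text dump like this is better suited to inspecting the logs than to recovering values byte for byte.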