Jean-Marc Spaggiari
2012-12-30, 17:08
Ted Yu
2012-12-30, 17:44
Jean-Marc Spaggiari
2012-12-30, 17:53
Ted Yu
2012-12-30, 18:23
Jean-Marc Spaggiari
2012-12-30, 18:37
Jean-Marc Spaggiari
2012-12-30, 18:59
Ted Yu
2012-12-30, 19:11
Ted Yu
2012-12-30, 19:21
Jean-Marc Spaggiari
2012-12-30, 19:25
Jean-Marc Spaggiari
2012-12-30, 19:50
lars hofhansl
2012-12-30, 21:33
Jean-Marc Spaggiari
2012-12-30, 22:15
Ted
2012-12-30, 22:26
Jean-Marc Spaggiari
2012-12-30, 22:42
Jesse Yates
2012-12-31, 00:13
Ted
2012-12-31, 00:22
Ted
2012-12-31, 00:29
-
CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 17:08
Hi,

I get an IOException ("/hbase/.archive/table_name is non empty") in my logs every minute.

There are 30 directories under this directory. The main directory is from yesterday, but all the subdirectories are from December 10th, all created at about the same time.

What is this .archive directory used for, and what should I do?

Thanks,

JM
-
Re: CleanerChore exception
Ted Yu, 2012-12-30, 17:44
Looks like you're using 0.94.3.

The archiver is a backport of:
HBASE-5547, Don't delete HFiles in backup mode

Can you provide more of the log where the IOE was reported, using pastebin?

Thanks
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 17:53
I was going to move to 0.94.4 today ;) And yes, I'm using 0.94.3. I might wait a bit in case some testing is required with my version.

Is this what you are looking for? http://pastebin.com/N8Q0FMba

I will keep the files for now since they don't seem to be causing any major issue. That will allow some more testing if required.

JM
-
Re: CleanerChore exception
Ted Yu, 2012-12-30, 18:23
The exception came from this line:

    if (file.isDir()) checkAndDeleteDirectory(file.getPath());

Looking at checkAndDeleteDirectory(), it recursively deletes files and directories under the specified path.

Does /hbase/.archive/entry_duplicate only contain empty directories underneath it?

You didn't modify the logcleaner plugin setting, right?

    <property>
      <name>hbase.master.logcleaner.plugins</name>
      <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner</value>
    </property>

Cheers
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 18:37
Regarding the logcleaner settings, I have not changed anything. It's what came with the initial install. So I don't have anything set up for this plugin in my configuration files.

For the files on the FS, here is what I have:

    hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls /hbase/.archive/entry_duplicate
    Found 30 items
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/00c185bc44b6dcf85a90b83bdda4ec2e
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/0ddf0d1802c6afd97d032fd09ea9e37d
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/18cf7c5c946ddf33e49b227feedfb688
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/2353f10e79dacc5cf201be6a1eb63607
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/243f4007cf05415062010a5650598bff
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/287682333698e36cea1670f5479fbf18
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/3742da9bd798342e638e1ce341f27537
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/435c9c08bc08ed7248a013b6ffaa163b
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/45346b4b4248d77d45e031ea71a1fb63
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/4afe48fe6d8defe569f8632dd2514b07
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/68a4e364fe791a0d1f47febbb41e8112
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/7673d718962535c7b54cef51830f22a5
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/7df6845ae9d052f4eae4a01e39313d61
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/8c5a263167d1b09f645af8efb4545554
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/8c98d9c635ba30d467d127a2ec1c69f8
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/8dfa96393e18ecca826fd9200e6bf68b
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/8e8f532e91a7197cd53b7626130be698
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/8eca1a325fe442a8546e43ac2f00cfef
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/9ad4c0551b90ea7717d7e3aaec76dc26
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/a135ccbc6f61ce544dbd537dc12489e9
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/a3d0332a6d51a8b15b99d1caca3f355a
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/bd2b8c942af27e541e20e430d506d2c0
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/c10c3a66948bde75fc41349108d86cf9
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/cbf2f178691bfca8a7e9825115629b8e
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/d14a2546eaceede73b282e444ad1bb40
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:38 /hbase/.archive/entry_duplicate/d570a21a39e04ba2ec896bbe7166423c
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/e943bda56acd6beb35bdd56f0560f87f
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/ef5692ba83aba48d9e7a6b9c2cd0661e
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/fd85dd319c289959a790faed32ef1530
    drwxr-xr-x - hbase supergroup 0 2012-12-10 14:39 /hbase/.archive/entry_duplicate/ffcdf6554accda1800e74838b67d3004
    hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls /hbase/.archive/entry_duplicate/00c185bc44b6dcf85a90b83bdda4ec2e
    hadoop@node3:~/hadoop-1.0.3$

I have not looked into ALL the subdirectories, but the first 10 are empty.

I see that there are some traces in checkAndDeleteDirectory... I will try to activate them and see if there are more details.

JM
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 18:59
So, looking deeper, I found a few things.

First, why is checkAndDeleteDirectory not "simply" calling FSUtils.delete(fs, toCheck, true)? I guess it's doing the same thing?

Also, FSUtils.listStatus(fs, toCheck, null); will return null if there is no status, not just an empty array. And if it returns null, we exit without calling the delete method.

I tried to manually create a file in one of those directories. The exception disappears for 300 seconds because of the TTL for the newly created file. After 300 seconds, the file I pushed AND the directory got removed. So the issue is really with empty directories.

I will take a look at what is in the trunk and in 0.94.4 to see if it's the same issue. But I think we can simply replace all this code with a call to FSUtils.delete.

I can open a JIRA and submit a patch for that. Just let me know.

JM
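The failure mode described in this message can be reproduced outside HBase. The following is only a minimal sketch, assuming java.nio.file as a stand-in for the Hadoop FileSystem: `listStatusOrNull` and `checkAndDeleteDirectoryBuggy` are hypothetical names that mimic the behaviour discussed in the thread (a listing helper that returns null for an empty directory, and a caller that treats null as "nothing left to do"). It is not the actual CleanerChore code, and it skips the TTL check on files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EmptyDirBugSketch {

    // Hypothetical listing helper: returns null, not an empty list, when the
    // directory has no children or does not exist.
    static List<Path> listStatusOrNull(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) return null;
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> children = entries.collect(Collectors.toList());
            return children.isEmpty() ? null : children;
        }
    }

    // Simplified pre-fix logic: a null listing is reported as success, but the
    // empty directory itself is never removed.
    static boolean checkAndDeleteDirectoryBuggy(Path toCheck) throws IOException {
        List<Path> children = listStatusOrNull(toCheck);
        if (children == null) return true; // empty directory stays on disk
        boolean canDeleteThis = true;
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                canDeleteThis &= checkAndDeleteDirectoryBuggy(child);
            } else {
                Files.delete(child); // assumption: every plain file may be deleted
            }
        }
        if (!canDeleteThis) return false;
        // Non-recursive delete: fails with an IOException if anything remains.
        Files.delete(toCheck);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path table = Files.createTempDirectory("entry_duplicate");
        Files.createDirectory(table.resolve("region1"));
        try {
            checkAndDeleteDirectoryBuggy(table);
        } catch (IOException e) {
            System.out.println("delete failed: " + e);
        }
    }
}
```

In this sketch the parent believes its only child was cleaned up, so the non-recursive delete of the parent fails on every run, which would match an exception that comes back every minute.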
-
Re: CleanerChore exception
Ted Yu, 2012-12-30, 19:11
Thanks for the digging. This concurs with my suspicion in the beginning.

I am copying Jesse, who wrote the code. He should have more insight on this.

After his confirmation, you can log a JIRA.

Cheers
-
Re: CleanerChore exception
Ted Yu, 2012-12-30, 19:21
Looking at this line in checkAndDeleteDirectory():

    return canDeleteThis ? fs.delete(toCheck, false) : false;

If fs.delete() returns false, meaning the deletion was unsuccessful, the parent directory tree wouldn't be deleted. I think this is inconsistent with the javadoc for checkAndDeleteDirectory():

    * @throws IOException if there is an unexpected filesystem error

We should either throw IOE in that case, or try deleting the sub-directory by passing true as the recursive argument to delete().

Cheers
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 19:25
Thanks for the confirmation.

Also, it seems that there is no test class related to checkAndDeleteDirectory. It might be good to add that too.

I have extracted 0.94.3, 0.94.4RC0 and the trunk, and they are all identical for this method.

I will try to do some modifications and see the results...

So far there are 2 options. One is to change the "children == null" case so that it handles the current empty directory, and the other is to call fs.delete() directly from checkAndDeleteDirectory instead of the existing code.

Will wait for Jesse's feedback.

JM
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 19:50
The Javadoc is saying:

    @return <tt>true</tt> if the directory was deleted, <tt>false</tt> otherwise

So I think the line

    return canDeleteThis ? fs.delete(toCheck, false) : false;

is still correct. It returns false if the directory has not been deleted. There is no exception here. If the TTL for a file has not expired, the file can't be deleted and false is returned. I think that's the correct behaviour.

The reason for not passing "true" for the recursive flag is explained in the comments:

    // if all the children have been deleted, then we should try to delete this directory. However,
    // don't do so recursively so we don't delete files that have been added since we checked.

And I think it's good. So the issue is really when the directory is empty and listStatus returns null. Then

    if (children == null) return true;

simply returns true without deleting the current directory. This should be changed to something like

    if (children == null) return fs.delete(toCheck, false);

which will try to delete the current directory, return true or false depending on whether that was possible, and throw an exception if there is any issue with the FS...

I have done some modifications. I'm compiling and will deploy the updated version on my local cluster soon. I will keep you posted on the result.

JM
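The change proposed in this message can be sketched as follows. This is a hedged illustration, not the HBase patch: it assumes java.nio.file in place of the Hadoop FileSystem, and `listOrNull`, `isExpired`, `ttlMillis` and `nowMillis` are hypothetical names introduced only for the example. `Files.deleteIfExists` plays the role of the non-recursive `fs.delete(toCheck, false)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EmptyDirFixSketch {

    // Hypothetical listing helper: null for an empty or missing directory.
    static List<Path> listOrNull(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) return null;
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> children = entries.collect(Collectors.toList());
            return children.isEmpty() ? null : children;
        }
    }

    // Hypothetical TTL rule: a file may be removed once it is older than ttlMillis.
    static boolean isExpired(Path file, long ttlMillis, long nowMillis) throws IOException {
        return nowMillis - Files.getLastModifiedTime(file).toMillis() > ttlMillis;
    }

    static boolean checkAndDeleteDirectory(Path toCheck, long ttlMillis, long nowMillis)
            throws IOException {
        List<Path> children = listOrNull(toCheck);
        // The proposed fix: try to delete the empty directory itself.
        // deleteIfExists returns false when the directory does not exist.
        if (children == null) return Files.deleteIfExists(toCheck);

        boolean canDeleteThis = true;
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                if (!checkAndDeleteDirectory(child, ttlMillis, nowMillis)) canDeleteThis = false;
            } else if (isExpired(child, ttlMillis, nowMillis)) {
                Files.delete(child);
            } else {
                canDeleteThis = false; // file is still within its TTL
            }
        }
        // Non-recursive on purpose: a file added since the listing makes this
        // throw an IOException and the file is kept.
        return canDeleteThis && Files.deleteIfExists(toCheck);
    }
}
```

With this shape, a directory holding a fresh file is kept until the TTL passes, after which both the file and the directory go away, and an empty directory is removed on the first pass.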
-
Re: CleanerChore exception
lars hofhansl, 2012-12-30, 21:33
Nothing has changed around this in 0.94.4 as far as I know.
-
Re: CleanerChore exception
Jean-Marc Spaggiari, 2012-12-30, 22:15
I made the change, pushed it, and it cleaned my directories correctly.

    // if the directory doesn't exist or is empty, then we are done
    if (children == null) return fs.delete(toCheck, false);

The only thing is that I don't know what fs.delete() will return in case the directory doesn't exist. But I think it's still correct to return false if the directory doesn't exist, because we can't really delete something which doesn't exist...

My opinion.

So the patch is ready, easy one ;) Just waiting for Jesse's feedback just in case.

JM
-
Re: CleanerChore exception
Ted, 2012-12-30, 22:26
Thanks for your digging.

A minor optimization would be to issue delete() on the parent directory so that there are fewer requests to the namenode.

Cheers
-
Re: CleanerChore exceptionJean-Marc Spaggiari 2012-12-30, 22:42
I'm not sure I'm getting that.
It's recursive. So when you are on the parent directory, you don't know yet whether the child directory is empty or not. So you can't call delete() yet. If you call delete() with "true" for the recursive flag, then you might delete some files that were just created, which we want to avoid.

IMHO.
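To illustrate the point above about not deleting recursively, here is a minimal sketch. It is not the HBase code: it uses java.nio.file on a local temporary directory as a stand-in for the Hadoop FileSystem API, and the names NonRecursiveDeleteSketch, FileCheck and checkAndDelete are hypothetical. It snapshots the children, deletes what the check allows, and then attempts a non-recursive delete of the directory, so a file added after the snapshot makes the directory delete fail instead of being removed silently.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class NonRecursiveDeleteSketch {

    /** Hypothetical stand-in for the cleaner's per-file decision (for example a TTL check). */
    interface FileCheck {
        boolean isDeletable(Path file) throws IOException;
    }

    /** Returns true only if dir and everything under it was removed. */
    static boolean checkAndDelete(Path dir, FileCheck check) throws IOException {
        // Snapshot the children first; anything created after this point is unknown to us.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }

        boolean canDeleteThis = true;
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                // Non-short-circuit &= so every subdirectory is still visited.
                canDeleteThis &= checkAndDelete(child, check);
            } else if (check.isDeletable(child)) {
                Files.delete(child);
            } else {
                canDeleteThis = false;
            }
        }
        if (!canDeleteThis) {
            return false;
        }
        try {
            // Non-recursive on purpose: this fails if a file appeared after the snapshot.
            Files.delete(dir);
            return true;
        } catch (DirectoryNotEmptyException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("archive-sketch");
        Files.createFile(dir.resolve("old"));
        Path late = dir.resolve("late");

        // Simulate a writer adding a file while the cleaner is working.
        boolean deleted = checkAndDelete(dir, file -> {
            Files.createFile(late);
            return true;
        });

        System.out.println("directory deleted: " + deleted);
        System.out.println("late file survived: " + Files.exists(late));

        // Clean up the temporary directory.
        Files.delete(late);
        Files.delete(dir);
    }
}
```

A recursive delete at the last step would have removed the late file as well, which is the case the thread wants to avoid.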
-
Re: CleanerChore exceptionJesse Yates 2012-12-31, 00:13
Hey,
So the point of all the delete code in the cleaner is to try to delete each of the files in the directory and then delete the directory, assuming it's empty. It shouldn't leak the IOException if the directory is found to be empty and then gets a file added.

This is really odd though, as failures should return false, not throw an exception (boo HDFS javadocs). Looking at the 0.94 and 0.96 code, it's just logged, which is annoying, but doesn't mean broken code.

Otherwise, Jean-Marc's analysis looks right. Should be a simple fix. I filed HBASE-7465 and should have a patch up shortly.

As an aside, this method is actually tested (in a somewhat roundabout way) in TestCleanerChore#testCleanerDoesNotDeleteDirectoryWithLateAddedFiles with a spy object that ensures we get this non-error case.

-Jesse
-------------------
Jesse Yates
@jesse_yates
jyates.github.com
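The problem and the one-line fix discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: TinyFs is a hypothetical in-memory stand-in, not the Hadoop FileSystem API, and it is written to mimic the two behaviours reported in the thread (a listing that returns null for an empty directory, and a non-recursive delete that throws an IOException on a non-empty directory). The `fixed` flag switches between `return true` and `return fs.delete(...)` for the null-children case.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class EmptyDirCleanupSketch {

    /** Hypothetical in-memory filesystem; it only mimics the behaviour described in the thread. */
    static final class TinyFs {
        // path -> true for a directory, false for a file
        private final Map<String, Boolean> entries = new TreeMap<>();

        void mkdir(String path) { entries.put(path, true); }
        void createFile(String path) { entries.put(path, false); }
        boolean exists(String path) { return entries.containsKey(path); }
        boolean isDirectory(String path) { return Boolean.TRUE.equals(entries.get(path)); }

        /** Returns the direct children, or null (not an empty list) when there are none. */
        List<String> listChildren(String dir) {
            String prefix = dir + "/";
            List<String> out = new ArrayList<>();
            for (String p : entries.keySet()) {
                if (p.startsWith(prefix) && p.indexOf('/', prefix.length()) < 0) {
                    out.add(p);
                }
            }
            return out.isEmpty() ? null : out;
        }

        /** Non-recursive delete: false if missing, IOException if the path still has children. */
        boolean delete(String path) throws IOException {
            if (!exists(path)) {
                return false;
            }
            if (listChildren(path) != null) {
                throw new IOException(path + " is non empty");
            }
            entries.remove(path);
            return true;
        }
    }

    /** fixed=false keeps the reported behaviour; fixed=true applies the change proposed in the thread. */
    static boolean checkAndDelete(TinyFs fs, String dir, boolean fixed) throws IOException {
        List<String> children = fs.listChildren(dir);
        if (children == null) {
            // Reported behaviour: report success but leave the empty directory in place.
            return fixed ? fs.delete(dir) : true;
        }
        boolean canDeleteThis = true;
        for (String child : children) {
            if (fs.isDirectory(child)) {
                canDeleteThis &= checkAndDelete(fs, child, fixed);
            } else {
                // This sketch treats every file as deletable (no TTL check).
                canDeleteThis &= fs.delete(child);
            }
        }
        return canDeleteThis ? fs.delete(dir) : false;
    }

    public static void main(String[] args) throws IOException {
        TinyFs fs = new TinyFs();
        fs.mkdir("/archive/table");
        fs.mkdir("/archive/table/region1");
        fs.mkdir("/archive/table/region2");

        try {
            checkAndDelete(fs, "/archive/table", false);
        } catch (IOException e) {
            System.out.println("without the fix: " + e.getMessage());
        }
        System.out.println("with the fix: " + checkAndDelete(fs, "/archive/table", true));
    }
}
```

Without the fix, the empty region directories are reported as handled but never removed, so the non-recursive delete of the table directory keeps failing on every run. With the fix, the empty children are deleted first and the parent delete succeeds.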
-
Re: CleanerChore exceptionTed 2012-12-31, 00:22
I am not at a computer for the moment.
This involves passing an extra parameter indicating the level of recursion and possibly using an enum in place of the Boolean return value. I will show you a patch when I get home.

Thanks
-
Re: CleanerChore exceptionTed 2012-12-31, 00:29
Jean-Marc:
Can you confirm that the JIRA Jesse logged reflects your case?

Thanks