Noah Watkins
2011-11-02, 22:57
Eli Collins
2011-11-02, 23:25
Noah Watkins
2011-11-02, 23:57
Ted Dunning
2011-11-03, 00:14
Harsh J
2011-11-03, 02:20
Uma Maheswara Rao G 72686...
2011-11-03, 11:27
M. C. Srivas
2011-11-05, 21:41
Uma Maheswara Rao G 72686...
2011-11-06, 02:51
Alejandro Abdelnur
2011-11-07, 19:15
-
FileSystem contract of listStatus (Noah Watkins, 2011-11-02, 22:57)
I have a question about the FileSystem contract in 0.20.

In FileSystemContractBaseTest:testFileStatus() several files are created, and afterwards the test confirms that they are present. Here is the relevant code:

    FileStatus[] paths = fs.listStatus(path("/test"));

    paths = fs.listStatus(path("/test/hadoop"));
    assertEquals(3, paths.length);
    assertEquals(path("/test/hadoop/a"), paths[0].getPath());
    assertEquals(path("/test/hadoop/b"), paths[1].getPath());
    assertEquals(path("/test/hadoop/c"), paths[2].getPath());

This test will fail if the results are not returned in this specific order. Is this ordering (alphanumeric?) part of the contract? Can a FileSystem return results from listStatus() in any order?

Thanks,
Noah
-
Re: FileSystem contract of listStatus (Eli Collins, 2011-11-02, 23:25)
Hey Noah,

HDFS returns items in lexicographic order by byte (see INode#compareBytes), but I don't think ordering was intended to be an explicit part of the contract; i.e., the test probably just needs to be modified to ignore the order.

RawLocalFileSystem uses Java's File#list, which has "no guarantee that the name strings in the resulting array will appear in any specific order; they are not, in particular, guaranteed to appear in alphabetical order." However, the FSContractBaseTest isn't run against local file systems, which is probably why it never came up.

Thanks,
Eli
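A minimal sketch of what an order-insensitive check could look like, in plain Java. Strings stand in for the paths that listStatus() would return; the class and method names (UnorderedListingCheck, sameEntries) are made up for illustration and are not part of Hadoop or its test suite.

```java
import java.util.Arrays;

class UnorderedListingCheck {
    /**
     * Returns true when both arrays hold the same names, ignoring order.
     * Sorted copies are compared, so the caller's arrays are left untouched
     * and duplicate entries are still counted correctly.
     */
    static boolean sameEntries(String[] expected, String[] actual) {
        String[] want = expected.clone();
        String[] got = actual.clone();
        Arrays.sort(want);
        Arrays.sort(got);
        return Arrays.equals(want, got);
    }

    public static void main(String[] args) {
        String[] expected = {"/test/hadoop/a", "/test/hadoop/b", "/test/hadoop/c"};
        // Hypothetical listing, in an order a local file system might return.
        String[] listed = {"/test/hadoop/c", "/test/hadoop/a", "/test/hadoop/b"};
        System.out.println(sameEntries(expected, listed)); // prints "true"
    }
}
```

A test written this way still verifies that exactly the three created files are listed, but no longer depends on the order in which a particular FileSystem returns them.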
-
Re: FileSystem contract of listStatus (Noah Watkins, 2011-11-02, 23:57)
----- Original Message -----
> From: "Eli Collins" <[EMAIL PROTECTED]>
>
> RawLocalFileSystem uses Java's File#list which has "no guarantee that
> the name strings in the resulting array will appear in any specific
> order; they are not, in particular, guaranteed to appear in
> alphabetical order.", however the FSContractBaseTest isn't run against
> local file systems which is why it probably never came up.

Thanks Eli. We are cleaning up the unit tests for Ceph, and those tests use an emulation layer built on top of the local FS, which is how we ran into this ordering issue. Getting a fix for this would be nice.

Thanks!
-Noah
-
Re: FileSystem contract of listStatus (Ted Dunning, 2011-11-03, 00:14)
I think that the API docs actually say globStatus is ordered and leave the ordering semantics for listStatus undefined.

http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/fs/RawLocalFileSystem.html#listStatus(org.apache.hadoop.fs.Path)
-
Re: FileSystem contract of listStatus (Harsh J, 2011-11-03, 02:20)
Perhaps fixes against this can be covered as part of https://issues.apache.org/jira/browse/HADOOP-7659
-
Re: FileSystem contract of listStatus (Uma Maheswara Rao G 72686..., 2011-11-03, 11:27)
Yes, I remember this issue, which Harsh filed recently.

globStatus sorts its results before returning them. Maybe we can fix listStatus in the same way.

Regards,
Uma
-
Re: FileSystem contract of listStatus (M. C. Srivas, 2011-11-05, 21:41)
On Thu, Nov 3, 2011 at 4:27 AM, Uma Maheswara Rao G 72686 <[EMAIL PROTECTED]> wrote:
> Yes, i remember this issue filed by Harsh recently.
> GlobStatus will sort the results and return. May be we can fix for
> listStatus in the same way.

Not a good idea to sort needlessly. That's why we have globStatus() and listStatus() ... those who want a sorted list can use globStatus().
-
Re: FileSystem contract of listStatus (Uma Maheswara Rao G 72686..., 2011-11-06, 02:51)
----- Original Message -----
From: "M. C. Srivas" <[EMAIL PROTECTED]>
Date: Sunday, November 6, 2011 3:13 am
Subject: Re: FileSystem contract of listStatus
To: [EMAIL PROTECTED]

> Not a good idea to sort needlessly. That's why we have
> globStatus() and listStatus() ... those who want a sorted
> list can use globStatus().

globStatus is for pattern matching, and listStatus is for listing all the files in a given directory.

Regards,
Uma
-
Re: FileSystem contract of listStatus (Alejandro Abdelnur, 2011-11-07, 19:15)
IMO sorting is something the FS shell should do, not the FileSystem.
Thanks.

Alejandro
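As a rough illustration of sorting on the caller's side rather than inside the file system, the sketch below lists a local directory with java.io.File#list (the call Eli mentions RawLocalFileSystem relies on) and sorts the names afterwards. SortedLocalListing, sortedNames and demo are hypothetical names used only here. Arrays.sort uses String's natural ordering, which is assumed to be close enough for this purpose; it is not guaranteed to match HDFS's byte-wise ordering for every non-ASCII name.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class SortedLocalListing {
    /** Lists dir with File#list and returns the names sorted by the caller. */
    static String[] sortedNames(File dir) {
        String[] names = dir.list(); // order is unspecified by the JDK
        if (names == null) {
            throw new IllegalArgumentException("not a readable directory: " + dir);
        }
        Arrays.sort(names); // String natural order
        return names;
    }

    /** Creates a temporary directory holding files c, a, b and lists it sorted. */
    static String[] demo() {
        try {
            Path dir = Files.createTempDirectory("listing-demo");
            dir.toFile().deleteOnExit();
            for (String name : new String[] {"c", "a", "b"}) {
                Files.createFile(dir.resolve(name)).toFile().deleteOnExit();
            }
            return sortedNames(dir.toFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints "[a, b, c]"
    }
}
```

Callers that do not care about order pay nothing extra under this arrangement, which is the trade-off argued for in the replies above.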