

Re: do HDFS files starting with _ (underscore) have special properties?
On Fri, Sep 2, 2011 at 4:04 PM, Meng Mao <[EMAIL PROTECTED]> wrote:

> We have a compression utility that tries to grab all subdirs of a directory
> on HDFS. It makes a call like this:
> FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*"));
>
> and handles files vs dirs accordingly.
>
> We tried to run our utility against a dir containing a computed SOLR shard,
> which has files that look like this:
> -rw-r--r--   2 hadoopuser visible 8538430603 2011-09-01 18:58 /test/output/solr-20110901165238/part-00000/data/index/_ox.fdt
> -rw-r--r--   2 hadoopuser visible  233396596 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.fdx
> -rw-r--r--   2 hadoopuser visible        130 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.fnm
> -rw-r--r--   2 hadoopuser visible 2147948283 2011-09-01 18:55 /test/output/solr-20110901165238/part-00000/data/index/_ox.frq
> -rw-r--r--   2 hadoopuser visible   87523726 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.nrm
> -rw-r--r--   2 hadoopuser visible  920936168 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.prx
> -rw-r--r--   2 hadoopuser visible   22619542 2011-09-01 18:58 /test/output/solr-20110901165238/part-00000/data/index/_ox.tii
> -rw-r--r--   2 hadoopuser visible 2070214402 2011-09-01 18:51 /test/output/solr-20110901165238/part-00000/data/index/_ox.tis
> -rw-r--r--   2 hadoopuser visible         20 2011-09-01 18:51 /test/output/solr-20110901165238/part-00000/data/index/segments.gen
> -rw-r--r--   2 hadoopuser visible        282 2011-09-01 18:55 /test/output/solr-20110901165238/part-00000/data/index/segments_2
>
>
> The globStatus call seems only able to pick up those last 2 files; the
> several files that start with _ don't register.
>
> I've skimmed the FileSystem and GlobExpander source to see if there's
> anything related to this, but didn't see it. Google didn't turn up anything
> about underscores. Am I misunderstanding something about the regex patterns
> needed to pick these up or unaware of some filename convention in HDFS?
>

By Hadoop convention, files whose names start with '_' are considered
'hidden', much like Unix files whose names start with '.'. I did not know
that for a very long time, because not every tool follows this rule and
not everyone knows about it.
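
To make the convention concrete, here is a minimal sketch of the rule in
plain Java, with no Hadoop dependency. The class and method names
(HiddenNames, isHidden, visible) are made up for illustration and are not
part of any Hadoop API. The predicate mirrors the hidden-file check that
FileInputFormat applies to job inputs (skip names starting with '_' or
'.'); whether your globStatus call goes through an equivalent filter is an
assumption you would need to verify against your Hadoop version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not a Hadoop class: models the "hidden file" naming rule.
public class HiddenNames {

    // A name counts as hidden if it starts with '_' or '.'.
    static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Keep only the names that the convention would treat as visible.
    static List<String> visible(List<String> names) {
        return names.stream()
                .filter(n -> !isHidden(n))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // File names taken from the listing in the original question.
        List<String> names = Arrays.asList(
                "_ox.fdt", "_ox.fdx", "_ox.fnm", "segments.gen", "segments_2");
        System.out.println(visible(names)); // prints [segments.gen, segments_2]
    }
}
```

Run against the names from the listing, this keeps only segments.gen and
segments_2, which matches what was observed. If the utility needs the
'_' files as well, one option is to pass an explicit PathFilter that
accepts every path to globStatus(Path, PathFilter), rather than relying
on whatever default filtering is in effect.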