Re: do HDFS files starting with _ (underscore) have special properties?
On Fri, Sep 2, 2011 at 4:04 PM, Meng Mao <[EMAIL PROTECTED]> wrote:

> We have a compression utility that tries to grab all subdirs to a directory
> on HDFS. It makes a call like this:
> FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*"));
>
> and handles files vs dirs accordingly.
>
> We tried to run our utility against a dir containing a computed SOLR shard,
> which has files that look like this:
> -rw-r--r--   2 hadoopuser visible 8538430603 2011-09-01 18:58
> /test/output/solr-20110901165238/part-00000/data/index/_ox.fdt
> -rw-r--r--   2 hadoopuser visible  233396596 2011-09-01 18:57
> /test/output/solr-20110901165238/part-00000/data/index/_ox.fdx
> -rw-r--r--   2 hadoopuser visible        130 2011-09-01 18:57
> /test/output/solr-20110901165238/part-00000/data/index/_ox.fnm
> -rw-r--r--   2 hadoopuser visible 2147948283 2011-09-01 18:55
> /test/output/solr-20110901165238/part-00000/data/index/_ox.frq
> -rw-r--r--   2 hadoopuser visible   87523726 2011-09-01 18:57
> /test/output/solr-20110901165238/part-00000/data/index/_ox.nrm
> -rw-r--r--   2 hadoopuser visible  920936168 2011-09-01 18:57
> /test/output/solr-20110901165238/part-00000/data/index/_ox.prx
> -rw-r--r--   2 hadoopuser visible   22619542 2011-09-01 18:58
> /test/output/solr-20110901165238/part-00000/data/index/_ox.tii
> -rw-r--r--   2 hadoopuser visible 2070214402 2011-09-01 18:51
> /test/output/solr-20110901165238/part-00000/data/index/_ox.tis
> -rw-r--r--   2 hadoopuser visible         20 2011-09-01 18:51
> /test/output/solr-20110901165238/part-00000/data/index/segments.gen
> -rw-r--r--   2 hadoopuser visible        282 2011-09-01 18:55
> /test/output/solr-20110901165238/part-00000/data/index/segments_2
>
>
> The globStatus call seems only able to pick up those last 2 files; the
> several files that start with _ don't register.
>
> I've skimmed the FileSystem and GlobExpander source to see if there's
> anything related to this, but didn't see it. Google didn't turn up anything
> about underscores. Am I misunderstanding something about the regex patterns
> needed to pick these up or unaware of some filename convention in HDFS?
>

Files starting with '_' are considered 'hidden', like Unix files starting
with '.'. I did not know that for a very long time, because not everyone
follows this convention or even knows about it.
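
To make the naming convention concrete, here is a minimal sketch in plain Java that applies the rule to a few of the paths from the listing above. It does not call any Hadoop API and is not Hadoop's own code; the class and method names (HiddenNames, isHidden, baseName, visible) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating the hidden-file naming convention.
class HiddenNames {
    // A name is treated as hidden if it starts with '_' or '.'.
    static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Final path component, e.g. "/a/b/_ox.fdt" -> "_ox.fdt".
    static String baseName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Keep only the paths whose final component is not hidden.
    static List<String> visible(List<String> paths) {
        return paths.stream()
                .filter(p -> !isHidden(baseName(p)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String dir = "/test/output/solr-20110901165238/part-00000/data/index/";
        List<String> paths = Arrays.asList(
                dir + "_ox.fdt",
                dir + "_ox.fdx",
                dir + "segments.gen",
                dir + "segments_2");
        // Only the names without a leading '_' or '.' remain.
        visible(paths).forEach(System.out::println);
    }
}
```

Note that segments_2 survives the filter: only a leading underscore counts, not one elsewhere in the name. That matches the symptom in the question, where only segments.gen and segments_2 were picked up.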