Re: Any possible to set hdfs block size to a value smaller than 64MB?
Hi Brian,

Interesting observations.
This is probably in line with the "client side mount table" approach, [soon to be] proposed in HDFS-1053.
Another way to provide a personalized view of the file system would be to use symbolic links, which are
available now in 0.21/0.22.
For 4K files I would probably use h-archives, especially if the data, as you describe it, is not changing.
But high-energy physicists should consider using HBase over HDFS.

--konst
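
A minimal sketch of the symlink suggestion above, assuming the FileContext API added in 0.21; the paths and the user name are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileContext;
    import org.apache.hadoop.fs.Path;

    // Sketch: assemble a per-user "view" of shared data with symlinks.
    // Assumes the FileContext symlink API from 0.21; paths are hypothetical.
    public class PersonalView {
      public static void main(String[] args) throws Exception {
        FileContext fc = FileContext.getFileContext(new Configuration());

        // Shared dataset that many users reference.
        Path target = new Path("/data/shared/ligands");

        // Per-user link that surfaces the shared data under the user's own directory.
        Path link = new Path("/user/alice/view/ligands");

        // true = create any missing parent directories of the link.
        fc.createSymlink(target, link, true);
      }
    }

Symlinks keep a single global namespace and let each user assemble their own layout on top of it; the client-side mount table of HDFS-1053 would instead move that assembly into client configuration.
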
On 5/18/2010 11:57 AM, Brian Bockelman wrote:
>
> Hey Konstantin,
>
> Interesting paper :)
>
> One thing which I've been kicking around lately is "at what scale does the file/directory paradigm break down?"
>
> At some point, I think the human mind can no longer comprehend so many files (certainly, I can barely organize the few thousand files on my laptop).  Thus, file growth comes from (a) having lots of humans use a single file system or (b) having automated programs generate the files.  For (a), you don't need a central global namespace, you just need the ability to have a "local" namespace per person that can be shared among friends.  For (b), a program isn't going to be upset if you replace a file system with a database / dataset object / bucket.
>
> Two examples:
> - structural biology: I've seen a lot of different analysis workflows (such as autodock) that compare a protein against a "database" of ligands, where the database is 80,000 O(4KB) files.  Each file represents a known ligand that the biologist might come back and examine if it is relevant to the study of their protein.  [A packing sketch for this kind of small-file collection follows the second example.]
> - high-energy physics: Each detector can produce millions of events a night, and experiments will produce many billions of events.  These are saved into files (each file containing hundreds or thousands of events); these files are kept in collections (called datasets, data blocks, lumi sections, or runs, depending on what you're doing).  Tasks are run against datasets; they will output smaller datasets which the physicists will iterate upon until they get some dataset which fits onto their laptop.
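
A minimal sketch of packing such a small-file collection into a single container, assuming a SequenceFile keyed by file name so the namenode tracks one large file instead of 80,000 tiny ones; the local directory and HDFS output path are hypothetical:

    import java.io.File;
    import java.nio.file.Files;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    // Sketch: pack a directory of ~4KB ligand files into one SequenceFile,
    // keyed by original file name.  Paths are hypothetical.
    public class PackLigands {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/data/ligands/ligands.seq");

        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, out, Text.class, BytesWritable.class);
        try {
          for (File f : new File("/local/ligands").listFiles()) {
            // Each file is only a few KB, so reading it whole is cheap.
            byte[] bytes = Files.readAllBytes(f.toPath());
            writer.append(new Text(f.getName()), new BytesWritable(bytes));
          }
        } finally {
          writer.close();
        }
      }
    }

If lookup by name matters more than streaming, a MapFile (sorted keys plus an index) or the Hadoop archive approach mentioned earlier in the thread would be a closer fit.
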
> Here's my Claim: The biologists have a small enough number of objects to manage each one as a separate file; they do this because it's easier for humans navigating around in a terminal.  The physicists have such a huge number of objects that there's no way to manage them using one file per object, so they utilize files only as a mechanism to serialize bytes of data and have higher-order data structures for management.
> Here's my Question: at what point do you move from the biologist's model (named objects, managed independently, single files) to the physicist's model (anonymous objects, managed in large groups, files are only used because we save data on file systems)?
>
> Another way to look at this is to consider DNS.  DNS maintains the namespace of the globe, but appears to do this just fine without a single central catalog.  If you start with a POSIX filesystem namespace (and the guarantees it implies), what rules must you relax in order to arrive at DNS?  On the scale of managing a million (a billion? ten billion? a trillion?) files, are any of the assumptions relevant?
>
> I don't know the answers to these questions, but I suspect they become important over the next 10 years.
>
> Brian
>
> PS - I started thinking along these lines during MSST when the LLNL guy was speculating about what it meant to "fsck" a file system with 1 trillion files.
>
> On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:
>
>> You can also get some performance numbers and answers to the block size dilemma here:
>>
>> http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
>>
>> I remember some people were using Hadoop for storing or streaming videos.
>> Don't know how well that worked.
>> It would be interesting to learn about your experience.
>>
>> Thanks,
>> --Konstantin
>>
>>
>> On 5/18/2010 8:41 AM, Brian Bockelman wrote:
>>> Hey Hassan,
>>>
>>> 1) The overhead is pretty small, measured in a small number of milliseconds on average