HDFS, mail # user - Block size in HDFS


Re: Block size in HDFS
Philip Zeyliger 2011-06-10, 16:00
On Fri, Jun 10, 2011 at 8:42 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> But how can I say that a 1KB file will use only 1KB of disk space if
> a block is configured as 64MB? In my view, if a 1KB file uses a 64MB
> block, the file will occupy 64MB on disk.

An HDFS block is the unit of distribution and replication, not the
unit of storage.  HDFS uses the underlying file system on each DataNode
for physical storage, so a block holding 1KB of data takes up only
about 1KB there.
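
As a rough illustration of that distinction, here is a minimal Python sketch. It is not part of HDFS, and `hdfs_footprint` and its parameters are hypothetical names. It assumes the 64MB block size from this thread and a replication factor of 3, and it ignores checksum files and any rounding done by the local file system.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64MB, the block size discussed in this thread


def hdfs_footprint(file_size, block_size=BLOCK_SIZE, replication=3):
    """Estimate (block_count, raw_bytes_on_disk) for one file.

    Assumptions (not taken from the thread): replication factor of 3,
    no checksum overhead, no local file system rounding.
    """
    if file_size < 0 or block_size <= 0 or replication <= 0:
        raise ValueError("sizes and replication must be positive")
    # Blocks are a logical split of the file: ceiling division.
    block_count = (file_size + block_size - 1) // block_size
    # Each replica stores only the bytes actually written, not a full block.
    raw_bytes_on_disk = file_size * replication
    return block_count, raw_bytes_on_disk


# A 1KB file: one block entry, 3 replicas of 1KB each.
small = hdfs_footprint(1024)          # (1, 3072)
# One byte past a full block spills into a second, nearly empty block.
spill = hdfs_footprint(BLOCK_SIZE + 1)
```

Under these assumptions, a 1KB file maps to one block entry but only about 3KB of raw disk across the cluster, not 64MB.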

-- Philip

>
> How can you dissociate a 64MB HDFS data block from a disk block?
>
> On Fri, Jun 10, 2011 at 5:01 PM, Marcos Ortiz <[EMAIL PROTECTED]> wrote:
>> On 06/10/2011 10:35 AM, Pedro Costa wrote:
>>
>> Hi,
>>
>> If I configure HDFS to use 64MB blocks and store a 1KB file in HDFS,
>> will this file occupy 64MB in HDFS?
>>
>> Thanks,
>>
>> HDFS is not very efficient at storing small files, because each file
>> occupies at least one block (64MB in your case), and the block metadata
>> is held in memory by the NameNode (NN). But you should know that this
>> 1KB file will use only 1KB of disk space.
>>
>> For small files, you can use Hadoop archives.
>> Regards
>>
>> --
>> Marcos Luís Ortíz Valmaseda
>>  Software Engineer (UCI)
>>  http://marcosluis2186.posterous.com
>>  http://twitter.com/marcosluis2186
>>
>>
>