Re: Block size in HDFS
On Fri, Jun 10, 2011 at 9:08 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> This means that, when HDFS reads a 1KB file from the disk, it will put
> the data in blocks of 64MB?

No.
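
A minimal sketch of what a read actually returns, assuming a reachable HDFS and a hypothetical 1KB file at /tmp/small.txt (the class name ReadSmallFile is also made up): the client streams back only the bytes that were written, regardless of the configured block size.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadSmallFile {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);      // fs.defaultFS from the client config
        Path p = new Path("/tmp/small.txt");       // hypothetical 1KB file

        long total = 0;
        byte[] buf = new byte[4096];
        try (FSDataInputStream in = fs.open(p)) {
          int n;
          while ((n = in.read(buf)) != -1) {
            total += n;                            // counts only bytes actually stored
          }
        }
        // Prints roughly 1024, not 64MB: the block size is an upper bound per
        // block, not padding that gets read back.
        System.out.println("bytes read: " + total);
      }
    }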

>
> On Fri, Jun 10, 2011 at 5:00 PM, Philip Zeyliger <[EMAIL PROTECTED]> wrote:
>> On Fri, Jun 10, 2011 at 8:42 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
>>> But how can I say that a 1KB file will only use 1KB of disk space, if
>>> a block is configured as 64MB? In my view, if a 1KB file uses a block of
>>> 64MB, the file will occupy 64MB on the disk.
>>
>> A block of HDFS is the unit of distribution and replication, not the
>> unit of storage.  HDFS uses the underlying file systems for physical
>> storage.
>>
>> -- Philip
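
A short sketch of this point using the stock FileSystem API, again with the hypothetical path /tmp/small.txt: the 64MB figure is per-file metadata reported by FileStatus.getBlockSize(), while the file's single block location is only as long as the data itself.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockVsStorage {
      public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path("/tmp/small.txt")); // hypothetical 1KB file

        System.out.println("block size (metadata): " + st.getBlockSize()); // e.g. 67108864 (64MB)
        System.out.println("file length:           " + st.getLen());       // e.g. 1024 (1KB)

        // Each reported block covers only the bytes that exist; a 1KB file has
        // one block location of length ~1024, replicated on a few datanodes.
        for (BlockLocation loc : fs.getFileBlockLocations(st, 0, st.getLen())) {
          System.out.println("offset=" + loc.getOffset()
              + " length=" + loc.getLength()
              + " hosts=" + String.join(",", loc.getHosts()));
        }
      }
    }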
>>
>>>
>>> How can you disassociate a 64MB HDFS data block from a disk block?
>>>
>>> On Fri, Jun 10, 2011 at 5:01 PM, Marcos Ortiz <[EMAIL PROTECTED]> wrote:
>>>> On 06/10/2011 10:35 AM, Pedro Costa wrote:
>>>>
>>>> Hi,
>>>>
>>>> If I define HDFS to use blocks of 64 MB, and I store a 1KB file in
>>>> HDFS, will this file occupy 64MB in HDFS?
>>>>
>>>> Thanks,
>>>>
>>>> HDFS is not very efficient at storing small files, because each file is
>>>> stored in its own block (of 64 MB in your case), and the block metadata
>>>> is held in memory by the NameNode (NN). But you should know that this 1KB
>>>> file will only use 1KB of disk space.
>>>>
>>>> For small files, you can use Hadoop Archives (HAR files).
>>>> Regards
>>>>
>>>> --
>>>> Marcos Luís Ortíz Valmaseda
>>>>  Software Engineer (UCI)
>>>>  http://marcosluis2186.posterous.com
>>>>  http://twitter.com/marcosluis2186
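
A hedged sketch of the Hadoop Archives suggestion, with a made-up archive at /user/pedro/small.har (such an archive would be built beforehand with the hadoop archive tool, for example: hadoop archive -archiveName small.har -p /user/pedro/smallfiles /user/pedro): the files inside are read back through the har:// scheme, while the NameNode only tracks the archive's few blocks and index files.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListHar {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();

        // har:// layers on top of the default filesystem; to the NameNode the
        // archive is just a handful of blocks plus index files, however many
        // small files it packs inside.
        Path har = new Path("har:///user/pedro/small.har");   // hypothetical archive
        FileSystem harFs = har.getFileSystem(conf);

        for (FileStatus st : harFs.listStatus(har)) {
          System.out.println(st.getPath() + "  (" + st.getLen() + " bytes)");
        }
      }
    }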
>>>>
>>>>
>>>
>>
>
>
>
> --
> ---------------------------
> Pedro Sá da Costa
>
> @: [EMAIL PROTECTED]
> @: [EMAIL PROTECTED]
>