HDFS >> mail # user >> Block size in HDFS


Pedro Costa 2011-06-10, 15:05
Marcos Ortiz 2011-06-10, 16:01
Pedro Costa 2011-06-10, 15:42
Philip Zeyliger 2011-06-10, 16:00
Pedro Costa 2011-06-10, 16:08
Re: Block size in HDFS
On Fri, Jun 10, 2011 at 9:08 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> Does this mean that when HDFS reads a 1KB file from disk, it puts
> the data into a 64MB block?

No.

>
> On Fri, Jun 10, 2011 at 5:00 PM, Philip Zeyliger <[EMAIL PROTECTED]> wrote:
>> On Fri, Jun 10, 2011 at 8:42 AM, Pedro Costa <[EMAIL PROTECTED]> wrote:
>>> But how can I say that a 1KB file will only use 1KB of disk space if
>>> a block is configured as 64MB? In my view, if a 1KB file uses a block of
>>> 64MB, the file will occupy 64MB on disk.
>>
>> A block of HDFS is the unit of distribution and replication, not the
>> unit of storage.  HDFS uses the underlying file systems for physical
>> storage.
>>
>> -- Philip
>>
>>>
>>> How can you dissociate a 64MB HDFS data block from a disk block?
>>>
>>> On Fri, Jun 10, 2011 at 5:01 PM, Marcos Ortiz <[EMAIL PROTECTED]> wrote:
>>>> On 06/10/2011 10:35 AM, Pedro Costa wrote:
>>>>
>>>> Hi,
>>>>
>>>> If I configure HDFS to use 64MB blocks and I store a 1KB file in HDFS,
>>>> will this file occupy 64MB in HDFS?
>>>>
>>>> Thanks,
>>>>
>>>> HDFS is not very efficient at storing small files, because each file occupies
>>>> at least one block (of up to 64 MB in your case), and the block metadata
>>>> is held in memory by the NameNode (NN). But you should know that this 1KB file
>>>> will only use 1KB of disk space.
>>>>
>>>> For small files, you can use Hadoop archives.
>>>> Regards
>>>>
>>>> --
>>>> Marcos Luís Ortíz Valmaseda
>>>>  Software Engineer (UCI)
>>>>  http://marcosluis2186.posterous.com
>>>>  http://twitter.com/marcosluis2186
>>>>
>>>>
>>>
>>
>
>
>
> --
> ---------------------------
> Pedro Sá da Costa
>
> @: [EMAIL PROTECTED]
> @: [EMAIL PROTECTED]
>
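
The distinction made in this thread (a block is the unit of distribution and replication, not of storage) can be illustrated with a little arithmetic. The following is only a rough sketch, not HDFS code: the function name `hdfs_footprint` is hypothetical, the replication factor of 3 is an assumption not stated in the thread, and the model ignores checksum files and the local file system's own allocation granularity.

```python
import math

MB = 1024 * 1024


def hdfs_footprint(file_size, block_size=64 * MB, replication=3):
    """Estimate how one file is accounted for (simplified model).

    replication=3 is an assumed setting; checksum files and local
    file system overhead are deliberately ignored.
    """
    # The file is cut into block-sized pieces; the last piece may be
    # shorter than block_size. An empty file has no blocks.
    num_blocks = math.ceil(file_size / block_size)
    # Each block replica is stored as an ordinary local file that holds
    # only the real data, so disk usage follows the file size.
    bytes_on_disk = file_size * replication
    # The NameNode keeps one entry for the file plus one per block,
    # which is why many small files are costly in NameNode memory.
    namenode_objects = 1 + num_blocks
    return num_blocks, bytes_on_disk, namenode_objects


for size in (1024, 64 * MB, 200 * MB):
    blocks, on_disk, objects = hdfs_footprint(size)
    print(f"{size} B: {blocks} block(s), {on_disk} B on disk, "
          f"{objects} NameNode objects")
```

Under these assumptions a 1KB file maps to one block but only about 3KB of physical storage across its replicas, rather than 64MB per replica.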
Pedro Costa 2011-06-10, 16:47
John George 2011-06-10, 17:16
Josh Patterson 2011-06-10, 19:34
Allen Wittenauer 2011-06-13, 17:05
Jain, Prem 2011-06-13, 20:37
Allen Wittenauer 2011-06-13, 21:08
Matthew Foley 2011-06-10, 19:05
sridhar basam 2011-06-10, 15:43