MapReduce >> mail # user >> Re: Maximum Storage size in a Single datanode


Re: Maximum Storage size in a Single datanode
Hi,

Also, think about the memory you will need in your DataNode to serve
all this data... I'm not sure there is any server which can take that
today. You need a certain amount of memory per block in the DN. With
all this data, you will have SOOOO many blocks...
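
To put a rough number on "SOOOO many blocks", here is a back-of-the-envelope sketch. The block sizes (64 MB and 128 MB) and the per-block memory figure are assumptions chosen for illustration, not measured DataNode values. The real cost depends on the Hadoop version and on how full each block is, and many small files would push the block count far higher.

```python
# Rough estimate of how many HDFS blocks a single DataNode would hold.
# Every constant here is an illustrative assumption, not a measured value.

PB = 1024 ** 5
MB = 1024 ** 2

def estimate_blocks(stored_bytes, block_size_bytes):
    """Lower bound on block count: assumes every block is completely full."""
    return -(-stored_bytes // block_size_bytes)  # ceiling division

def estimate_heap_bytes(num_blocks, bytes_per_block):
    """Memory needed if each block costs a fixed number of bytes in the DN."""
    return num_blocks * bytes_per_block

# Hypothetical per-block overhead; replace with a figure for your own setup.
ASSUMED_BYTES_PER_BLOCK = 300

blocks_64 = estimate_blocks(1 * PB, 64 * MB)    # 16,777,216 blocks
blocks_128 = estimate_blocks(1 * PB, 128 * MB)  # 8,388,608 blocks
heap_64 = estimate_heap_bytes(blocks_64, ASSUMED_BYTES_PER_BLOCK)

print(blocks_64, blocks_128, heap_64)
```

Since this is a lower bound, the real block count can only be higher when blocks are not full.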

Regarding RH vs Ubuntu, I think Ubuntu is more of an end-user
distribution than a server one. And I found RH a bit "not free
enough". I have installed Debian on all my servers.

JM

2013/1/30, Vijay Thakorlal <[EMAIL PROTECTED]>:
> Jeba,
>
>
>
> I'm not aware of any hadoop limitations in this respect (others may be able
> to comment on this); since blocks are just files on the OS, the datanode
> will create subdirectories to store blocks to avoid problems with large
> numbers of files in a single directory. So I would think the limitations
> are primarily around the type of file system you select: ext3
> theoretically supports up to 16TB (http://en.wikipedia.org/wiki/Ext3) and
> ext4 up to 1EB (http://en.wikipedia.org/wiki/Ext4). Although you're
> probably already planning on deploying 64-bit servers, I believe for large
> FS on ext4 you'd be better off with a 64-bit server.
>
>
>
> As far as the OS is concerned, anecdotally (based on blogs, hadoop mailing
> lists etc.) I believe there are more production deployments using RHEL
> and/or CentOS than Ubuntu.
>
>
>
> It's probably not practical to have nodes with 1PB of data for the reasons
> that others have mentioned and due to the replication traffic that will be
> generated if the node dies. Not to mention fsck times with large file
> systems.
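
The replication point can be made concrete with a small sketch. When a node dies, the cluster has to re-create every block it held. The aggregate re-replication bandwidth below (1 GiB/s across the whole cluster) is purely an assumed figure for illustration; real throughput depends on cluster size, network and throttling settings.

```python
# Rough time for the cluster to re-replicate the data of one dead DataNode.
# The bandwidth figure is a hypothetical assumption, not a Hadoop default.

PB = 1024 ** 5
GB = 1024 ** 3

def rereplication_hours(node_bytes, aggregate_bytes_per_sec):
    """Hours needed to copy node_bytes at a sustained aggregate rate."""
    return node_bytes / aggregate_bytes_per_sec / 3600

ASSUMED_CLUSTER_RATE = 1 * GB  # bytes per second, assumed

hours = rereplication_hours(1 * PB, ASSUMED_CLUSTER_RATE)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

Under that assumption a single 1PB node takes well over a week to recover from, and the cluster runs with reduced redundancy for those blocks the whole time.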
>
>
>
> Vijay
>
>
>
>
>
>
>
> From: jeba earnest [mailto:[EMAIL PROTECTED]]
> Sent: 30 January 2013 10:40
> To: [EMAIL PROTECTED]
> Subject: Re: Maximum Storage size in a Single datanode
>
>
>
>
>
> I want to use either Ubuntu or Red Hat.
>
> I just want to know how much storage space we can allocate in a single data
> node.
>
>
>
> Are there any limitations in hadoop on storage in a single node?
>
>
>
>
>
>
>
> Regards,
>
> Jeba
>
>   _____
>
> From: "Pamecha, Abhishek" <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; jeba earnest
> <[EMAIL PROTECTED]>
> Sent: Wednesday, 30 January 2013 2:45 PM
> Subject: Re: Maximum Storage size in a Single datanode
>
>
>
> What would be the reason you would do that?
>
>
>
> You would want to leverage a distributed dataset for higher availability
> and better response times.
>
>
>
> The maximum storage depends completely on the disk capacity of your nodes
> and what your OS supports. Typically I have heard of about 1-2 TB/node to
> start with, but I may be wrong.
>
> -abhishek
>
>
>
>
>
> From: jeba earnest <[EMAIL PROTECTED]>
> Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, jeba earnest
> <[EMAIL PROTECTED]>
> Date: Wednesday, January 30, 2013 1:38 PM
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Subject: Maximum Storage size in a Single datanode
>
>
>
>
>
> Hi,
>
>
>
> Is it possible to keep 1 Petabyte in a single data node?
>
> If not, what is the maximum storage for a particular data node?
>
>
>
> Regards,
> M. Jeba
>
>
>
>