NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, cassandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
MapReduce >> mail # user >> Host NameNode, DataNode, JobTracker or TaskTracker on the same machine


Jeff LI 2013-02-14, 17:10
shashwat shriparv 2013-02-14, 17:29
Tariq 2013-02-14, 17:41
Jeff LI 2013-02-14, 18:05

Re: Host NameNode, DataNode, JobTracker or TaskTracker on the same machine
With the current configuration you are safe. But as your data grows you
will consume more disk space, and eventually you might end up with
insufficient space to hold the metadata itself, since it is stored on the
same disk. Also, bigger data means more files and blocks, which means more
objects for the NameNode to track, which in turn means greater memory
consumption. And don't forget the resource consumption of your processing
layer, such as the disk space required to store intermediate output files
and the resources required to launch map and reduce tasks.
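
To make the "more files and blocks means more memory" chain concrete, here is a rough sketch in Python. It assumes every file and every block costs the NameNode a fixed number of bytes of heap. The figure of 150 bytes per object is a commonly quoted rule of thumb and is an assumption here, not a number from this thread, and the function name is hypothetical.

```python
# Rough NameNode heap estimate: each file and each block is one in-memory object.
# BYTES_PER_OBJECT is an assumed rule-of-thumb value, not a figure from this thread.
BYTES_PER_OBJECT = 150


def estimate_namenode_heap_bytes(num_files, blocks_per_file,
                                 bytes_per_object=BYTES_PER_OBJECT):
    """Return an approximate metadata footprint in bytes."""
    num_blocks = num_files * blocks_per_file
    num_objects = num_files + num_blocks  # one object per file plus one per block
    return num_objects * bytes_per_object


if __name__ == "__main__":
    # Hypothetical file counts, assuming two blocks per file on average.
    for num_files in (1_000_000, 10_000_000, 100_000_000):
        heap = estimate_namenode_heap_bytes(num_files, blocks_per_file=2)
        print(f"{num_files:>11,} files -> ~{heap / 1024**3:.2f} GiB of metadata")
```

The point is only the shape of the growth: the footprint scales with the number of files and blocks, not with the raw bytes stored, so many small files cost more memory than a few large ones.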

But it all depends on the size of your data and the intensity of the
processing you are going to perform. As of now you look good to me with
128 TB of disk and 64 GB of memory per machine.
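
As a quick check of the numbers quoted below, here is a small Python sketch that applies the "1 GB metadata ≈ 1 PB physical storage" rule of thumb to a 128 TB cluster. It assumes 1 PB = 1024 TB and treats the rule as linear; the function name is hypothetical.

```python
# Apply the rule of thumb quoted in this thread: ~1 GB of metadata per 1 PB of storage.
TB_PER_PB = 1024           # assumption: binary units
GB_METADATA_PER_PB = 1.0   # rule of thumb from the linked post


def metadata_gb(storage_tb):
    """Approximate NameNode metadata size in GB for a given raw capacity in TB."""
    return storage_tb / TB_PER_PB * GB_METADATA_PER_PB


if __name__ == "__main__":
    cluster_tb = 128   # total disk in the cluster, from this thread
    node_ram_gb = 64   # memory per machine, from this thread
    needed = metadata_gb(cluster_tb)
    print(f"~{needed:.3f} GB of metadata, "
          f"about {needed / node_ram_gb:.2%} of {node_ram_gb} GB RAM")
```

Under these assumptions the metadata is a small fraction of one machine's memory, which is why the current setup looks safe. The estimate says nothing about the memory and disk needed by the processing layer.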

HTH

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Thu, Feb 14, 2013 at 11:35 PM, Jeff LI <[EMAIL PROTECTED]> wrote:

> Thanks for your response.  I'm running SNN on another machine.
>
> Could you explain a bit more about why I may run out of memory or disk?
>
> I understand that NameNode holds file system metadata in memory.  According
> to this post (
> http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
> ),
> as a rule of thumb,
> 1 GB metadata ≈ 1 PB physical storage
>
> Currently, my cluster has about 128TB of disk storage in total and 64GB
> memory on each machine.  Does this suggest that I'm protected against
> running out of memory from metadata?
>
> Thanks
>
> Cheers
>
> Jeff
>
>
> On Thu, Feb 14, 2013 at 12:41 PM, Tariq <[EMAIL PROTECTED]> wrote:
>
>> You may run out of memory or disk space. If SNN is also running on the same
>> machine, then you are totally screwed in case of any breakdown.
>>
>> shashwat shriparv <[EMAIL PROTECTED]> wrote:
>>
>> >If you are doing it for production, all the processes should run on
>> >separate machines, as this reduces the load on each machine.
>> >
>> >
>> >
>> >∞
>> >Shashwat Shriparv
>> >
>> >
>> >
>> >On Thu, Feb 14, 2013 at 10:40 PM, Jeff LI <[EMAIL PROTECTED]> wrote:
>> >
>> >> Hello,
>> >>
>> >> Is there a good reason that we should not host NameNode, DataNode,
>> >> JobTracker or TaskTracker services on the same machine?
>> >>
>> >> Not doing so is suggested here
>> >> http://wiki.apache.org/hadoop/NameNode,
>> >> but I'd like to know the reasoning behind this.
>> >>
>> >> Thanks
>> >>
>> >> Cheers
>> >>
>> >> Jeff
>> >>
>> >>
>>
>> --
>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>
>
>