
Re: OK to run data node on same machine as secondary name node?
Very helpful info.  I hadn't considered the bandwidth aspect of it.
Thanks much, Harsh!


On 08/16/2012 12:58 AM, Harsh J wrote:
> I'd not do this if the fsimage size is greater than, say, 5-6 GB. The
> SNN pulls this from the NameNode and then pushes it back, and the
> transfer can get heavy. If you have
> https://issues.apache.org/jira/browse/HDFS-1457 (image transfer
> throttler) in the version of Hadoop you use, you can set it to a
> proper value and keep the SNN on a slave node without worrying about
> it hogging all the available bandwidth.
> On Thu, Aug 16, 2012 at 3:41 AM, David Rosenstrauch <[EMAIL PROTECTED]> wrote:
>> I have a Hadoop cluster that's a little tight on resources.  I was thinking
>> one way I could solve this would be to run an additional data node on
>> the same machine as the secondary name node.
>> I wouldn't dare do that on the primary name node, since that machine needs
>> to be extremely performant.  But since all the secondary name node does is
>> merge the name node's checkpoint and logs, which is not an
>> activity that requires top-notch real-time performance, I thought it might
>> not be a problem if I were to set up a data node running there as well.
>> Any reasons why that might be a bad idea?
>> Thanks,
>> DR
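
To put the bandwidth concern above in numbers, here is a rough back-of-the-envelope sketch. It assumes each checkpoint moves the fsimage twice (one pull from the NameNode, one push back) and that the transfer runs at a fixed throttled rate. The function name and the 6 GiB / 20 MiB/s figures are hypothetical, chosen only for illustration. As far as I know, the throttle from HDFS-1457 is set through the dfs.image.transfer.bandwidthPerSec property (bytes per second), but check the hdfs-default.xml of your Hadoop version before relying on that.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def checkpoint_transfer_seconds(fsimage_bytes, bandwidth_bytes_per_sec, transfers=2):
    """Rough time spent moving the fsimage for one checkpoint.

    transfers=2 is an assumption: one pull from the NameNode and one
    push back. Edit logs and protocol overhead are ignored.
    """
    if bandwidth_bytes_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return transfers * fsimage_bytes / bandwidth_bytes_per_sec


# Hypothetical figures: a 6 GiB fsimage throttled to 20 MiB/s.
seconds = checkpoint_transfer_seconds(6 * GIB, 20 * MIB)
print(f"{seconds / 60:.1f} minutes of transfer per checkpoint")
```

A lower throttle leaves more bandwidth for the data node sharing the machine, at the cost of longer checkpoints, so the value should be picked with the checkpoint interval in mind.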