Hadoop >> mail # general >> rack awareness help


Re: rack awareness help
You only specify the script on the namenode.
So, you could do something like:

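#!/bin/bash
# rack_decider.sh
# Prints the rack name for the host name or IP address passed as $1.

# "$1" is quoted so that an empty or missing argument falls through
# to the else branch instead of causing a test syntax error.
if [ "$1" = "server1.mydomain" ] || [ "$1" = "192.168.0.1" ] ; then
  echo rack1
elif [ "$1" = "server2.mydomain" ] || [ "$1" = "192.168.0.2" ] ; then
  echo rack1
elif [ "$1" = "server3.mydomain" ] || [ "$1" = "192.168.0.3" ] ; then
  echo rack2
elif [ "$1" = "server4.mydomain" ] || [ "$1" = "192.168.0.4" ] ; then
  echo rack2
else
  echo unknown_rack
fi
# EOF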

Of course, this is by far the most basic script you could have (I'm
not sure why it wasn't offered as an example instead of a more
complicated one).
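
If the list of hosts grows, the same idea can be written as a lookup table, which is roughly what the more complicated wiki script discussed further down the thread does with a separate file. The following is only a minimal sketch under some assumptions: the file name rack_lookup.sh, the RACK_TABLE variable and the lookup_rack function are made up for illustration, the table is embedded in the script rather than read from a separate topology file, and it assumes Hadoop may pass more than one host per call, so it prints one rack per argument.

```shell
#!/bin/bash
# rack_lookup.sh (hypothetical name): table-driven variant of rack_decider.sh.
# Assumption: several host names or IPs may be passed in one call, so the
# script prints one rack per argument, in the same order.

# Node-to-rack table: name or IP on the left, rack on the right.
# The entries are the example hosts from rack_decider.sh.
RACK_TABLE='
server1.mydomain rack1
192.168.0.1      rack1
server2.mydomain rack1
192.168.0.2      rack1
server3.mydomain rack2
192.168.0.3      rack2
server4.mydomain rack2
192.168.0.4      rack2
'

# Print the rack for one node, or unknown_rack if the node is not listed.
lookup_rack() {
  local node="$1" name rack
  while read -r name rack; do
    # Skip blank lines, then compare the left column with the node.
    if [ -n "$name" ] && [ "$name" = "$node" ]; then
      echo "$rack"
      return 0
    fi
  done <<< "$RACK_TABLE"
  echo unknown_rack
}

# Resolve every argument; with no arguments nothing is printed.
for node in "$@"; do
  lookup_rack "$node"
done
```

You can check it by hand before pointing Hadoop at it, for example by running it with server1.mydomain and 192.168.0.3 as arguments and confirming that it prints one rack name per line. The rack names simply follow the example above; Hadoop's own default rack is named /default-rack, so you may need names with a leading slash such as /rack1 in a real cluster.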

On Thu, Mar 18, 2010 at 8:41 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
> Chris:
>
> This clears up my questions a lot! Thank you.
>
> So, if I have 4 data servers and I want 2 racks, I can do this:
>
> #!/bin/bash
> #rack1.sh
> echo rack1
>
> #!/bin/bash
> #rack2.sh
> echo rack2
>
>
> So, I can do this for 2 servers:
>
>
> <property>
>  <name>topology.script.file.name</name>
>  <value>rack1.sh</value>
> </property>
>
> And for the other 2 servers, I can do this:
>
>
> <property>
>  <name>topology.script.file.name</name>
>  <value>rack2.sh</value>
> </property>
>
>
> correct?
>
>
> On Thu, Mar 18, 2010 at 3:15 AM, Christopher Tubbs <[EMAIL PROTECTED]> wrote:
>> Hadoop will identify data nodes in your cluster by name and execute
>> your script with the data node as an argument. The expected output of
>> your script is the name of the rack on which it is located.
>>
>> The script you referenced takes the node name as an argument ($1), and
>> crawls through a separate file looking up that node in the left
>> column, and printing the value in the second column if it finds it.
>>
>> If you were to use this script, you would just create the topology
>> file that lists all your nodes by name/ip on the left and the rack
>> they are in on the right.
>>
>> On Wed, Mar 17, 2010 at 11:34 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
>>> Well, I didn't really solve the problem. Now I have even more questions.
>>>
>>> I came across this script,
>>> http://wiki.apache.org/hadoop/topology_rack_awareness_scripts
>>>
>>> but it makes no sense to me! Can someone please try to explain what
>>> it's trying to do?
>>>
>>>
>>> MikeThomas:
>>>
>>> Your script isn't working for me. I think there are some syntax
>>> errors. Is this how it's supposed to look: http://pastebin.ca/1844287
>>>
>>> thanks
>>>
>>>
>>>
>>> On Thu, Mar 4, 2010 at 10:30 PM, Jeff Hammerbacher <[EMAIL PROTECTED]> wrote:
>>>> Hey Mag,
>>>>
>>>> Glad you have solved the problem. I've created a JIRA ticket to improve the
>>>> existing documentation: https://issues.apache.org/jira/browse/HADOOP-6616.
>>>> If you have some time, it would be useful to hear what could be added to the
>>>> existing documentation that would have helped you figure this out sooner.
>>>>
>>>> Thanks,
>>>> Jeff
>>>>
>>>> On Thu, Mar 4, 2010 at 3:39 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Thanks everyone for explaining this to me instead of giving me RTFM!
>>>>>
>>>>> I will play around with it and see how far I get.
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 4, 2010 at 9:21 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
>>>>> > Allen Wittenauer wrote:
>>>>> >>
>>>>> >> On 3/3/10 5:01 PM, "Mag Gam" <[EMAIL PROTECTED]> wrote:
>>>>> >>
>>>>> >>> Thanks Allen! Your presentation is very nice!
>>>>> >>
>>>>> >> Thanks. :)
>>>>> >>
>>>>> >>> "If you don't provide a script for rack awareness, it treats every
>>>>> >>> node as if it was its own rack". I am using the default settings and
>>>>> >>> the report still says only 1 rack.
>>>>> >>
>>>>> >> Let's take a different approach to convince you. :)
>>>>> >>
>>>>> >> Think about the question:  Is there a difference between all nodes in
>>>>> >> one rack vs. every node acting as a lone rack?
>>>>> >>
>>>>> >> The answer is no, there isn't any difference.  In both cases, all copies
>>>>> >> of the blocks can go to pretty much any node. When a MR job runs, every
>>>>> >> node would either be considered 'off rack' or 'rack-local'.