Re: rack awareness help
More like the following (shown with the bash prompt). You could type
this for testing. However, ultimately, hadoop itself will actually be
executing this script and reading its output.

$ ./script.sh server1.mydomain
rack1
$ ./script.sh server2.mydomain
rack1
$ ./script.sh server3.mydomain
rack2
$ ./script.sh server4.mydomain
rack2
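
One more thing worth noting: hadoop may hand the script more than one
hostname/IP per invocation and expects one rack printed per argument, so
a version that loops over all of its arguments is a bit more robust. A
rough sketch along the same lines (the hostnames are just placeholders,
and depending on your version you may need to return the rack as a path
like /rack1):

#!/bin/bash
# rack_decider.sh - print one rack per hostname/IP argument
for host in "$@" ; do
  case "$host" in
    server1.mydomain|192.168.0.1|server2.mydomain|192.168.0.2)
      echo rack1 ;;
    server3.mydomain|192.168.0.3|server4.mydomain|192.168.0.4)
      echo rack2 ;;
    *)
      echo unknown_rack ;;
  esac
done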

On Fri, Mar 19, 2010 at 7:32 AM, Mag Gam <[EMAIL PROTECTED]> wrote:
> Thanks everyone. I think everyone can agree that this part of the
> documentation is lacking for hadoop.
>
> Can someone please provide me a use case, for example:
>
> #server 1
> Input > script.sh
> Output > rack01
>
> #server 2
> Input > script.sh
> Output > rack02
>
>
> Is this how it's supposed to work? I am bad with bash, so I am trying to
> understand the logic so I can implement it in another language such
> as Tcl.
>
>
> On Fri, Mar 19, 2010 at 1:00 AM, Christopher Tubbs <[EMAIL PROTECTED]> wrote:
>> You only specify the script on the namenode.
>> So, you could do something like:
>>
>> #!/bin/bash
>> #rack_decider.sh
>>
>> if [ "$1" = "server1.mydomain" -o "$1" = "192.168.0.1" ] ; then
>>  echo rack1
>> elif [ "$1" = "server2.mydomain" -o "$1" = "192.168.0.2" ] ; then
>>  echo rack1
>> elif [ "$1" = "server3.mydomain" -o "$1" = "192.168.0.3" ] ; then
>>  echo rack2
>> elif [ "$1" = "server4.mydomain" -o "$1" = "192.168.0.4" ] ; then
>>  echo rack2
>> else
>>  echo unknown_rack
>> fi
>> # EOF
>>
>> Of course, this is by far the most basic script you could have (I'm
>> not sure why it wasn't offered as an example instead of a more
>> complicated one).
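>>
>> You'd then point hadoop at it in the namenode's configuration with
>> something along these lines (the path here is just an example; use
>> wherever you put the script, and make sure it's executable):
>>
>> <property>
>>  <name>topology.script.file.name</name>
>>  <value>/path/to/rack_decider.sh</value>
>> </property>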
>>
>> On Thu, Mar 18, 2010 at 8:41 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
>>> Chris:
>>>
>>> This clears up my questions a lot! Thank you.
>>>
>>> So, if I have 4 data servers and I want 2 racks, I can do this:
>>>
>>> #!/bin/bash
>>> #rack1.sh
>>> echo rack1
>>>
>>> #!/bin/bash
>>> #rack2.sh
>>> echo rack2
>>>
>>>
>>> So, I can do this for 2 servers
>>>
>>>
>>> <property>
>>>  <name>topology.script.file.name</name>
>>>  <value>rack1.sh</value>
>>> </property>
>>>
>>> And for the other 2 servers, I can do this:
>>>
>>>
>>> <property>
>>>  <name>topology.script.file.name</name>
>>>  <value>rack2.sh</value>
>>> </property>
>>>
>>>
>>> correct?
>>>
>>>
>>> On Thu, Mar 18, 2010 at 3:15 AM, Christopher Tubbs <[EMAIL PROTECTED]> wrote:
>>>> Hadoop will identify data nodes in your cluster by name and execute
>>>> your script with the data node as an argument. The expected output of
>>>> your script is the name of the rack on which it is located.
>>>>
>>>> The script you referenced takes the node name as an argument ($1), and
>>>> crawls through a separate file looking up that node in the left
>>>> column, and printing the value in the second column if it finds it.
>>>>
>>>> If you were to use this script, you would just create the topology
>>>> file that lists all your nodes by name/ip on the left and the rack
>>>> they are in on the right.
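>>>>
>>>> For example, the topology file it reads might look something like this
>>>> (hostnames and rack names here are just placeholders):
>>>>
>>>> server1.mydomain    /rack1
>>>> 192.168.0.1         /rack1
>>>> server3.mydomain    /rack2
>>>> 192.168.0.3         /rack2
>>>>
>>>> i.e. the node name or IP on the left and its rack on the right, which
>>>> is what the script looks each node up in.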
>>>>
>>>> On Wed, Mar 17, 2010 at 11:34 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
>>>>> Well, I didn't really solve the problem. Now I have even more questions.
>>>>>
>>>>> I came across this script,
>>>>> http://wiki.apache.org/hadoop/topology_rack_awareness_scripts
>>>>>
>>>>> but it makes no sense to me! Can someone please try to explain what
>>>>> it's trying to do?
>>>>>
>>>>>
>>>>> MikeThomas:
>>>>>
>>>>> Your script isn't working for me. I think there are some syntax
>>>>> errors. Is this how it's supposed to look: http://pastebin.ca/1844287
>>>>>
>>>>> thanks
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 4, 2010 at 10:30 PM, Jeff Hammerbacher <[EMAIL PROTECTED]> wrote:
>>>>>> Hey Mag,
>>>>>>
>>>>>> Glad you have solved the problem. I've created a JIRA ticket to improve the
>>>>>> existing documentation: https://issues.apache.org/jira/browse/HADOOP-6616.
>>>>>> If you have some time, it would be useful to hear what could be added to the
>>>>>> existing documentation that would have helped you figure this out sooner.
>>>>>>
>>>>>> Thanks,
>>>>>> Jeff
>>>>>>
>>>>>> On Thu, Mar 4, 2010 at 3:39 PM, Mag Gam <[EMAIL PROTECTED]> wrote: