Re: Remote Java client connection into EC2 instance
The IP addresses assigned on the cluster are all internal ones, so when the regionservers do a reverse lookup, they get something like foo.internal. They report that name to the master, which hands it out to the client library as the region location. So while you can telnet to port 60020 on the slaves because you know their public DNS names, the client library only ever learns the internal ones.

Some options:

1) Run your clients up in the EC2 cloud also

2) Use a connector like Stargate or the Thrift server which can in effect proxy your requests to the EC2 hosted cluster.

3) Grab the latest scripts from the 0.20 branch in SVN. The file $HOME/.hbase-<cluster>-instances will hold the list of instance identifiers of the slaves. Do:

   ec2-describe-instances `cat ~/.hbase-<cluster>-instances` | grep INSTANCE | grep running | awk '{print $4, $5}'

This will give you a mapping between public and private names. Add entries to your /etc/hosts that map each public IP (use dig to look it up) to the corresponding private name. Yes, it's not a nice hack.
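
A rough sketch of that /etc/hosts step, in shell. The function names resolve_ip and hosts_lines are made up for illustration, and the sketch assumes dig is installed and that the input is one "public-name private-name" pair per line. It only prints candidate lines; nothing is written to /etc/hosts.

```shell
#!/bin/sh
# Sketch only: turn "public-name private-name" pairs into /etc/hosts lines.
# resolve_ip and hosts_lines are hypothetical helper names, not part of
# the HBase EC2 scripts.

# resolve_ip NAME: print the first IPv4 address that dig returns for NAME.
# Assumes dig is available and NAME resolves from the machine running this.
resolve_ip() {
    dig +short "$1" </dev/null | grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n 1
}

# hosts_lines: read "public private" pairs on stdin and write
# "ip<TAB>private" lines on stdout, one per pair that resolves.
hosts_lines() {
    while read -r public private; do
        # Skip blank or incomplete lines.
        [ -n "$public" ] && [ -n "$private" ] || continue
        ip=$(resolve_ip "$public")
        if [ -n "$ip" ]; then
            printf '%s\t%s\n' "$ip" "$private"
        else
            echo "could not resolve $public" >&2
        fi
    done
}
```

Pipe the public/private name pairs from the ec2-describe-instances command into hosts_lines, look over what it prints, and only then append it to /etc/hosts by hand. Run it from the client machine outside EC2, since that is where the public names need to resolve to public IPs.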

4) You can use SSH as a SOCKS 5 proxy (ssh -f -N -D <local-port> <remote>), which will also forward DNS requests, but to do it that way you'd have to hack the client library some.
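
For what it's worth, here is a small sketch of wrapping that ssh invocation. The function name start_socks_proxy and the DRY_RUN switch are made up for illustration; the ssh flags are the ones given above. This only opens the tunnel. It does not make the HBase client use it, which is the part that would need the client library changes.

```shell
#!/bin/sh
# Sketch only: open an SSH dynamic (SOCKS 5) forward on a local port.
# start_socks_proxy and DRY_RUN are hypothetical names for illustration.

# start_socks_proxy LOCAL_PORT REMOTE
# With DRY_RUN=1 the command is printed instead of executed.
start_socks_proxy() {
    port=$1
    remote=$2
    # Reject an empty or non-numeric port before calling ssh.
    case $port in
        ''|*[!0-9]*) echo "local port must be a number" >&2; return 2 ;;
    esac
    [ -n "$remote" ] || { echo "remote host required" >&2; return 2; }
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ssh -f -N -D $port $remote"
    else
        # -f: go to background, -N: no remote command, -D: dynamic forward
        ssh -f -N -D "$port" "$remote"
    fi
}
```

Once the tunnel is up, one way to check that names are being resolved on the far side is curl with its --socks5-hostname option pointed at localhost:<local-port>, requesting any HTTP page served on one of the private names. Which page to request is up to you; nothing in this thread specifies one.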

   - Andy

> From: George Stathis
> Subject: Remote Java client connection into EC2 instance
> Date: Friday, March 19, 2010, 8:00 AM
> This has come up before
> <http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200909.mbox/%[EMAIL PROTECTED]%3E>
> but I'm still unclear as to whether this is possible or not:
> remotely connecting to an EC2 instance using the Java client
> library.
> Now, I have gone through a lot of threads and posts and have
> opened up all required ports (I think) on EC2: 60000, 60020
> and 2181 (I can telnet into them). I have one test EC2
> instance running in pseudo-distributed mode to
> test the remote connection. I attempt to run a single unit
> test.