Hi All, I have around 20-30 geographically distant clients that need to write data to a centralized HBase server. I have a dedicated VPN for the communication, so bandwidth won't be an issue. Is it a good idea to have the clients send data directly to the centralized server? Or would a geographically distributed Hadoop cluster be more efficient for this scenario? Has anybody come across such a use case? I need suggestions on the viability and efficiency of the setup.
I mean directly interacting with the remote ZooKeeper. Say I'm able to access the ZK server and HBase server externally. On 3 April 2014 18:54, Ted Yu <[EMAIL PROTECTED]> wrote: Cheers, Manthosh Kumar. T
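For context, an external client talking directly to the cluster mainly needs the ZooKeeper quorum in its client-side hbase-site.xml; HBase then discovers the region servers through ZK. A minimal sketch (the hostnames are placeholders, and all the listed hosts must be resolvable and reachable from the client over the VPN):

```xml
<configuration>
  <!-- Comma-separated ZooKeeper ensemble the client contacts first -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <!-- ZooKeeper client port (2181 is the default) -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```

Note that ZK only hands back region server addresses; the client then opens direct connections to those servers as well, so every region server must also be reachable from outside, not just ZooKeeper.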
Efficient? Probably not ;) But it's as if you were connecting your client app to a webserver local to the cluster, and then the webserver connects to the cluster.
I don't like the idea of having the cluster accessible from the outside and usually prefer to have some kind of gateway, but that's your call.
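One ready-made gateway option here is HBase's bundled REST server, which lets remote clients write over HTTP to a single endpoint instead of reaching ZooKeeper and every region server directly. A sketch of the idea (hostnames and port are placeholders; row key, column, and value in the REST body are base64-encoded):

```
# On a node inside the cluster network, start the bundled REST gateway:
hbase rest start -p 8080

# A remote client can then write over HTTP, e.g. row "row1",
# column "cf:col", value "value" into table "mytable":
curl -X PUT -H "Content-Type: application/json" \
  "http://gateway.example.com:8080/mytable/row1/cf:col" \
  -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2Y6Y29s","$":"dmFsdWU="}]}]}'
```

HBase also ships a Thrift gateway if you prefer binary RPC; either way, only the gateway host needs to be exposed through the VPN.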
Your efficiency will mainly depend on the RPC calls you are doing. If you send or retrieve a big batch of data at a time, it should not be that bad. But if you get cells one by one and send edits one by one, it might not be very good.
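The point about batching can be sketched with a simple client-side buffer. This is pure illustration, not HBase API: `EditBuffer` and `sendBatch` are hypothetical names, and in a real client `sendBatch` would be the place to call something like `Table.put(List<Put>)` so that many edits share one round trip over the WAN:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: buffer edits client-side and flush them in batches, so 20-30
// remote clients issue few large RPCs instead of many tiny ones.
public class EditBuffer {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private int rpcCount = 0; // round trips issued so far

    public EditBuffer(int batchSize) {
        this.batchSize = batchSize;
    }

    public void add(String edit) {
        pending.add(edit);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    public void flush() {
        if (pending.isEmpty()) return;
        sendBatch(new ArrayList<>(pending)); // one RPC for the whole batch
        pending.clear();
        rpcCount++;
    }

    // Stand-in for the real write call (e.g. Table.put(List<Put>));
    // here we only count the round trip.
    private void sendBatch(List<String> batch) {
    }

    public int rpcCount() {
        return rpcCount;
    }

    public static void main(String[] args) {
        EditBuffer buf = new EditBuffer(100);
        for (int i = 0; i < 1000; i++) {
            buf.add("edit-" + i);
        }
        buf.flush(); // push any remainder
        // 1000 edits travel in 10 round trips instead of 1000.
        System.out.println(buf.rpcCount());
    }
}
```

Over a VPN with real latency, this difference dominates: each unbatched edit pays a full round trip, while a batch of 100 pays it once.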
JM 2014-04-03 9:57 GMT-04:00 Manthosh Kumar T <[EMAIL PROTECTED]>:
Hi Jean, Thanks. I might sound a bit lame, but can you elaborate on the gateway part? What is the best practice? On 3 April 2014 19:30, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote: Cheers, Manthosh Kumar. T