Zookeeper, mail # user - curator service discovery, search to Select a service instance

Re: curator service discovery, search to Select a service instance
Jordan Zimmerman 2013-01-28, 23:59
Thinking more about your usage…

It sounds like you will use your own ID scheme, so the payload for getChildren() will be much smaller. This is probably more doable than I originally thought. Further, if you run into performance problems, you can skip the ServiceProvider (which uses a cache internally) and query directly each time by calling:

ServiceDiscovery.queryForInstance(String name, String id)

You can treat the lookup as a binary search: if the id you try isn't registered, pick the next candidate id with a binary-search step and query again.
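
A minimal sketch of the search step, under some assumptions that are not in the thread: ids are plain integers, and the list of registered ids has already been fetched (for example from the child names that getChildren() returns). The class and method names (FloorIdSearch, floorId) are hypothetical. It uses a binary search over the sorted ids to find the requested id or, failing that, the next smaller one; the result could then be passed to ServiceDiscovery.queryForInstance(name, id).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper, not part of Curator.
final class FloorIdSearch {
    /**
     * Returns the largest registered id that is <= wanted,
     * or an empty OptionalInt if every registered id is larger.
     */
    static OptionalInt floorId(List<Integer> registeredIds, int wanted) {
        // Sort a copy so the caller's list is left untouched.
        List<Integer> sorted = new ArrayList<>(registeredIds);
        Collections.sort(sorted);

        int pos = Collections.binarySearch(sorted, wanted);
        if (pos >= 0) {
            return OptionalInt.of(sorted.get(pos)); // exact match
        }
        // Not found: binarySearch returns -(insertionPoint) - 1.
        int insertionPoint = -pos - 1;
        if (insertionPoint == 0) {
            return OptionalInt.empty(); // nothing at or below wanted
        }
        return OptionalInt.of(sorted.get(insertionPoint - 1));
    }
}
```

With registered ids 1, 2, 5, 33 and 40, a request for 33 resolves to 33 and a request for 32 falls back to 5. Because instances can disappear between listing and lookup, the caller would still need to handle a failed point query and retry.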

Just a thought…


On Jan 28, 2013, at 3:43 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:

> It's really up to you. The only thing to be concerned with is the Jute transport limits. Of course, you can always increase this. See jute.maxbuffer here: http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html
> For Curator-Discovery you can calculate the space you need. See here: https://github.com/Netflix/curator/wiki/Service-Discovery. Curator-Discovery names each instance node with its UUID, and each UUID name is 36 bytes. So a getChildren() call on 10K nodes returns 10K × 36 bytes, or ~360 KB, which is well under the 1 MB limit. The main issue is that if you use a ServiceCache, the initial load will require 10K+ ZK calls. This will probably be acceptable; on a gigabit network it won't take too long. Bear in mind that there will also be 10K+ watchers in ZK. Again, ZK should handle this fine.
> Do you expect to grow an order of magnitude from 10K? If so, you might consider clustering. If 10K is the limit you should be fine.
> -JZ
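
The size estimate in the message above can be written out as a small calculation. This is only a rough sketch: the class and method names are hypothetical, and the per-name overhead parameter is an assumption standing in for whatever framing the wire format adds to each child name (pass 0 to reproduce the figure quoted above). The limit used is the ZooKeeper default for jute.maxbuffer, 0xfffff bytes, which is just under 1 MB.

```java
// Hypothetical helper for estimating the size of a getChildren() response.
final class ChildrenPayloadEstimate {
    // Default value of jute.maxbuffer: 0xfffff bytes, just under 1 MB.
    static final long DEFAULT_JUTE_MAX_BUFFER = 0xfffffL;

    /**
     * Rough payload size: one name per child node.
     * perNameOverhead is an assumed allowance for per-entry framing.
     */
    static long estimateBytes(long nodeCount, int nameLength, int perNameOverhead) {
        return nodeCount * (nameLength + perNameOverhead);
    }

    /** True if the estimate stays below the default buffer limit. */
    static boolean fitsDefaultBuffer(long estimatedBytes) {
        return estimatedBytes < DEFAULT_JUTE_MAX_BUFFER;
    }
}
```

For 10,000 nodes with 36-byte names and no overhead, the estimate is 360,000 bytes, which fits. At 100,000 nodes the same calculation gives 3,600,000 bytes, which exceeds the default limit, so growth by an order of magnitude would call for a larger jute.maxbuffer or for splitting the nodes across several parents.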
> On Jan 28, 2013, at 3:34 PM, Yasin <[EMAIL PROTECTED]> wrote:
>> This will work if we treat each rack as a different service. But then
>> service retrieval becomes problematic. Even though I treat each rack as
>> a different service, I should still think of all of them as a single
>> service. We still need to keep the id numbering. I mean rack1 might have
>> ids from 1..32, and rack2 might have 33..64. So I still need to find the
>> node whose id is best for the client: either the one with the same number
>> or the next smallest one.
>> Another approach would be to classify nodes into, say, 10 clusters, and put
>> each node in an appropriate znode. For example,
>> root/cluster1_1000/node1,...,root/cluster1_1000/node1000,
>> root/cluster1001_2000/node1,...,root/cluster1001_2000/node1000, and so on. If
>> the service gets "getInstance(245)" it will know that the desired node will
>> be in root/cluster1_1000/. Then I will use some heuristics to retrieve
>> some nodes, sort them based on their ids, and find the most appropriate one.
>> How about this idea?
>> Thanks
>> Yasin
>> --
>> View this message in context: http://zookeeper-user.578899.n2.nabble.com/curator-service-discovery-search-to-Select-a-service-intance-tp7578447p7578456.html
>> Sent from the zookeeper-user mailing list archive at Nabble.com.
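
The bucketing idea in the quoted message, mapping a numeric id to the znode that holds its range, could look roughly like the sketch below. The class and method names are hypothetical, and it assumes ids start at 1 and every bucket has the same fixed size; the path format follows the root/cluster1_1000 examples in the message.

```java
// Hypothetical helper that maps a numeric id to its cluster znode path.
final class ClusterBuckets {
    /**
     * Returns the parent path for the bucket containing id,
     * e.g. "root/cluster1001_2000" for a bucket size of 1000.
     */
    static String bucketPath(String root, int id, int bucketSize) {
        if (id < 1 || bucketSize < 1) {
            throw new IllegalArgumentException("id and bucketSize must be >= 1");
        }
        // Ids are 1-based, so shift by one before the integer division.
        int start = ((id - 1) / bucketSize) * bucketSize + 1;
        int end = start + bucketSize - 1;
        return root + "/cluster" + start + "_" + end;
    }
}
```

With a bucket size of 1000, id 1245 maps to root/cluster1001_2000, and the boundary ids 1000 and 1001 land in different buckets. If the best match must be the next smaller id and the bucket holds nothing at or below the requested id, the caller would also need to look in the preceding bucket.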