Zookeeper, mail # user - Can ZK be used for my use case?


RE: Can ZK be used for my use case?
Tavi 2013-09-19, 19:44
Hi,

First of all I want to thank you for your answers.

I understand the node size limits, and I don't intend to use ZK to
transfer my data, which, by the way, can reach a few gigabytes.

I need something to act as an orchestrator (a coordination system) that
announces a new version to all my active clients (the leader will inform its
clients about the version number and, maybe, give some brief information
about the files to be downloaded and handled ...).

My first idea was to create a central web service that would provide my
update information and receive a confirmation from each client that has
finished processing. But all the logic that I need seems to be included in
the ZK core (communication, group handling, authentication ...).
On my FTP server I can have around 40k files. A full transfer can be done in
30 minutes at most.

The number of clients can change dynamically, but the leader will know in
advance when a client is installed or removed. In fact, I was thinking
that the client would be a Java daemon, manually installed on the client side
and configured to connect to my main ZK server and watch a dedicated node
where the leader will store the update information for that client.

Network fluctuations are a real problem, but I saw that the ZK API
provides the logic to handle this situation.

Regarding the frequency of changes: once a day a new version will be
created on my central FTP site, but usually only a few files will have
changed (max. 2 MB each). Once a month a big file (1 to 10 GB) must be
updated on the client side.
If a client misses a modification, this can be a problem ....

In fact, Rakesh gave me an idea:
- First, the leader will create the "hierarchical namespace", something like
/application_1/A_group/client_a1
/application_1/A_group/client_a2
      ....
/application_x/y_group/client_yn
/application_x/versions/1
/application_x/versions/2

where "client_yn" is a dedicated node for a single client.
- each node will hold a file (XML, JSON or other), let's say
"status.data.xml"
- when a new version is created, the leader will update the "version number"
in every "status.data.xml" and create a new node
"/application_x/versions/N" holding another information file with the
summary of changes (names of the FTP files to be added, replaced or deleted
on the client).
- when a client starts, it will read the content of its file
("status.data.xml"), compare the latest version number with the one saved
locally and, if necessary, access the "/application_x/versions/N" node to
get all the information about the version to be handled. After each
processing run it will write back to its "status.data.xml" the time when
the conversion process finished. At the end it will set a watch on its
"status.data.xml" data file.
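
To make the comparison step above concrete, here is a minimal sketch in Java
(since the client would be a Java daemon). It only builds the node paths and
works out which version nodes a client still has to read. It does not talk to
ZK at all. The class and method names are hypothetical, and it assumes that
version numbers are consecutive integers and that a client which was offline
must process every version it missed, not only the latest one.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the client-side catch-up step; all names here are hypothetical. */
class VersionCatchUp {

    /** Path of a client's dedicated node, e.g. /application_1/A_group/client_a1. */
    public static String clientNode(String app, String group, String client) {
        return "/" + app + "/" + group + "/" + client;
    }

    /** Path of the node describing version n, e.g. /application_x/versions/2. */
    public static String versionNode(String app, int n) {
        return "/" + app + "/versions/" + n;
    }

    /**
     * Version nodes the client still has to process, oldest first.
     * Assumption: versions are consecutive integers and every missed
     * version has to be applied in order.
     */
    public static List<String> pendingVersionNodes(String app,
                                                   int localVersion,
                                                   int latestVersion) {
        List<String> pending = new ArrayList<>();
        for (int n = localVersion + 1; n <= latestVersion; n++) {
            pending.add(versionNode(app, n));
        }
        return pending;
    }

    public static void main(String[] args) {
        // The client last handled version 3; its status file now says 5.
        System.out.println(clientNode("application_1", "A_group", "client_a1"));
        System.out.println(pendingVersionNodes("application_x", 3, 5));
    }
}
```

In a real client, the local version would come from local storage and the
latest version from the content of "status.data.xml". An empty result means
the client is up to date and only needs to set its watch again.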

Is this scenario a valid one, or have I misunderstood the use of ZK?

Thanks again,
    Tavi
--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/Can-ZK-be-used-for-my-use-case-tp7579049p7579121.html
Sent from the zookeeper-user mailing list archive at Nabble.com.