I am not sure whether this is the right place to ask a question like this. If it is not,
please point me to a more appropriate forum.
We need to pull files from one or more remote servers into HDFS while preserving the
file names. Pulling the files would be a lot easier for us than installing software on the remote servers.
We have to expect occasional network issues, so a transfer may fail and need to
continue from where it left off. In that scenario, we can add an extension to the file name
indicating that the copy took multiple attempts. We cannot move or rename the files on
the source server. If we restart for some reason, we should not copy files that have already been copied.
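To make the restart requirement concrete, here is a minimal sketch of the behaviour described above. It is not Flume code. It assumes local directories stand in for both the remote server and HDFS, and the names `pull_files` and `PARTIAL_SUFFIX` are hypothetical. It also assumes an in-progress copy carries a `.part` extension and is appended to on retry, which is only one possible reading of the extension idea mentioned above.

```python
import os
import shutil

# Hypothetical marker for a copy that has not finished yet (an assumption,
# not something Flume or HDFS defines).
PARTIAL_SUFFIX = ".part"


def pull_files(source_dir, dest_dir, chunk_size=1024 * 1024):
    """Copy files from source_dir to dest_dir, keeping the file names.

    The source is only read, never moved or renamed. Files that already
    exist in dest_dir under their final name are skipped, and a leftover
    partial copy is resumed from its current size.
    Returns the names completed during this run.
    """
    os.makedirs(dest_dir, exist_ok=True)
    completed = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        if not os.path.isfile(src):
            continue  # ignore subdirectories in this sketch

        final = os.path.join(dest_dir, name)
        if os.path.exists(final):
            continue  # copied in an earlier run, do not copy again

        partial = final + PARTIAL_SUFFIX
        # Resume point: number of bytes a previous attempt already wrote.
        offset = os.path.getsize(partial) if os.path.exists(partial) else 0

        with open(src, "rb") as fin, open(partial, "ab") as fout:
            fin.seek(offset)
            shutil.copyfileobj(fin, fout, chunk_size)

        # Only a fully copied file gets the original name.
        os.rename(partial, final)
        completed.append(name)
    return completed
```

Running `pull_files` a second time after a failure picks up any `.part` file where it stopped and skips everything already present under its final name. The sketch assumes source files do not change between attempts and that the destination supports appending, which would need to be checked for a real HDFS target.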
Previously, we did this with shell scripts.
We are planning to use Flume. Is there an existing solution in Flume, or do we need to develop
Denny Ye 2013-02-14, 03:46