MapReduce user mailing list - Copy files from remote folder to HDFS


Re: Copy files from remote folder to HDFS
Mohammad Tariq 2013-01-25, 04:56
Hello Panshul,

      You might find Flume <http://flume.apache.org/> useful.
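
If it helps, here is a minimal sketch of an agent configuration, assuming
Flume 1.x with a spooling-directory source feeding an HDFS sink through a
memory channel. The agent name, spool directory, and NameNode address are
placeholders you would adapt:

# Hypothetical agent "agent1": spooldir source -> memory channel -> HDFS sink
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Watch a local drop folder; files must be complete (immutable) once placed there
agent1.sources.src1.type = spooldir
agent1.sources.src1.spoolDir = /data/incoming/json
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Write into date-partitioned HDFS directories, rolling output at ~128 MB
# so the many ~6 KB inputs end up packed into a few large HDFS files
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.channel = ch1
agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/ingest/json/%Y-%m-%d
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
agent1.sinks.sink1.hdfs.rollSize = 134217728
agent1.sinks.sink1.hdfs.rollCount = 0
agent1.sinks.sink1.hdfs.rollInterval = 0

You would start it with something like:

flume-ng agent --conf conf --conf-file agent1.conf --name agent1

A nice side effect is that Flume batches your small JSON files into large
HDFS files, which is much kinder to the NameNode than landing millions of
tiny files.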

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Fri, Jan 25, 2013 at 6:39 AM, Panshul Whisper <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I am trying to copy JSON files from a remote folder (a folder on my
> local system, a Cloudfiles folder, or a folder on an S3 server) to the
> HDFS of a cluster running at a remote location.
> The job-submitting application is based on Spring Hadoop.
>
> Could someone please suggest, or point me in the right direction toward,
> the best option for achieving this task:
> 1. Use Spring Integration data pipelines to poll the folders for files
> and copy them to HDFS as they arrive in the source folder. I have tried
> to implement the solution from the Spring Data book, but it does not
> run, and I have no idea what is wrong because it generates no logs.
>
> 2. Use some other scripted method to transfer the files.
>
> The main requirement: I need to transfer files from a remote folder to
> HDFS every day at a fixed time for processing in the Hadoop cluster.
> These files are collected from various sources into the remote folders.
>
> Please suggest an efficient approach. I have been searching and have
> found a lot of approaches, but I am unable to decide which will work
> best, and this transfer needs to be as fast as possible.
> The files to be transferred amount to almost 10 GB of JSON files, each
> no more than 6 KB.
>
> Thank you,
>
>
> --
>  Regards,
> Ouch Whisper
> 010101010101
>
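
A note on option 2 from the message above: for the local-folder case, a
scheduled transfer can be as simple as a small program using Hadoop's
FileSystem API, invoked once a day by cron or a Quartz trigger. A minimal
sketch, assuming the Hadoop client jars are on the classpath; the staging
path and the NameNode address are placeholder values:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DailyJsonUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; in practice this comes from core-site.xml
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        FileSystem fs = FileSystem.get(conf);

        // Recursively copy the local staging folder into HDFS.
        // delSrc=false keeps the local files; overwrite=true replaces
        // anything already present under the target path.
        fs.copyFromLocalFile(false, true,
                new Path("/data/incoming/json"),
                new Path("/ingest/json"));

        fs.close();
    }
}

For the S3 and Cloudfiles sources you would need a client to pull the
files down first (or, for S3, DistCp with an s3n:// source URI). Either
way, with files this small it is worth packing them into larger containers
(SequenceFiles, or Flume's rolled output above) before the MapReduce job
runs, since HDFS handles a few large files far better than millions of
6 KB ones.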