Re: How to tell my Hadoop cluster to read data from an external server
You are looking at a two-step workflow here.

The first unit of your workflow downloads the file from the external server,
writes it to DFS, and returns the file path.
The second unit of your workflow reads that input path and processes the data
according to your business logic in MR.
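
As a rough sketch of these two steps (not a definitive implementation), the
Python below streams the remote file into DFS with `hadoop fs -put -`, which
reads its data from stdin, and then builds the `hadoop jar` command for the MR
job. The URL, DFS path, jar name, and class name you pass in are up to you; the
sketch assumes the `hadoop` CLI is on the PATH and that your job takes the
input and output paths as its two arguments.

```python
import shutil
import subprocess
import urllib.request


def stage_to_dfs(source_url, dfs_path,
                 opener=urllib.request.urlopen, popen=subprocess.Popen):
    """Step 1: stream a remote file into DFS and return the DFS path.

    `opener` and `popen` are injectable only so the function can be
    exercised without a network connection or a real cluster.
    """
    # `hadoop fs -put - <dst>` stores whatever arrives on stdin at <dst>.
    cmd = ["hadoop", "fs", "-put", "-", dfs_path]
    with opener(source_url) as remote:
        proc = popen(cmd, stdin=subprocess.PIPE)
        # Copy in chunks so the whole file never has to fit in memory.
        shutil.copyfileobj(remote, proc.stdin)
        proc.stdin.close()
        if proc.wait() != 0:
            raise RuntimeError("hadoop fs -put failed for " + dfs_path)
    return dfs_path


def job_command(jar, main_class, input_path, output_path):
    """Step 2: build the command that runs the MR job on the staged input.

    Assumes the job's driver takes <input> <output> as its arguments.
    """
    return ["hadoop", "jar", jar, main_class, input_path, output_path]
```

To chain the two, pass the path returned by `stage_to_dfs(...)` as the
`input_path` of `job_command(...)` and run the result with
`subprocess.check_call`.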

You can look at Cascading for this simple approach. It's easy to build a
simple workflow application using Cascading.
Other options are Oozie, or you may try Crunch (it's very new but easy to
use as well).

On Tue, Mar 26, 2013 at 2:49 PM, Agarwal, Nikhil
<[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a Hadoop cluster up and running. I want to submit an MR job to it
> but the input data is kept on an external server (outside the Hadoop
> cluster). Can anyone please suggest how do I tell my Hadoop cluster to load
> the input data from the external servers and then do a MR on it?
>
> Thanks & Regards,
>
> Nikhil
>

--
Nitin Pawar