Re: Moving data from a remote machine into the HDFS
Mohammad Tariq 2012-01-24, 19:28
Thanks a lot for the valuable reply. I have to go with the 2nd option,
as the remote machine has no access to our HDFS. I'll go through the
link you specified and act accordingly.
On Wed, Jan 25, 2012 at 12:49 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> You have two ways:
> A. If the remote node has access to all HDFS machines (NN + all DNs),
> simply do a "hadoop dfs -put" operation to push the data in.
> B. If the remote node has no access to HDFS, set up a bastion box with
> Hoop and write to HDFS via Hoop. Hoop provides a REST API to do this.
> Some examples to write can be found here:
> http://cloudera.github.com/hoop/docs/latest/HttpRestApi.html (See
> section "File System Operations", and the Write example).
> The box you set up must be accessible by the remote node, and the box
> itself should be able to access your HDFS in a regular fashion (NN +
> all DNs), so that it can relay your writes.
> Hoop also has security support, so you can use it against secured
> clusters and prevent writes from non-authed folks. The same link
> carries instructions for this as well.
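
A rough sketch of the two options above, in Python, in case it helps. The
"hadoop dfs -put" command is the one quoted in the reply. The REST part is
an assumption: it uses the WebHDFS-style URL layout (/webhdfs/v1/<path>
with op=CREATE, user.name and data=true) exposed by HttpFS, Hoop's
successor, and the original Hoop API may differ, so check the linked docs.
The host name "bastion", port 14000 and all paths are hypothetical.

```python
import os
import subprocess
import urllib.parse
import urllib.request


def put_command(local_path, hdfs_path):
    """Option A: argument list for the 'hadoop dfs -put' command."""
    return ["hadoop", "dfs", "-put", local_path, hdfs_path]


def run_put(local_path, hdfs_path):
    """Run option A; needs a Hadoop client that can reach the NN and all DNs."""
    subprocess.run(put_command(local_path, hdfs_path), check=True)


def create_url(gateway, hdfs_path, user):
    """Option B: build the file-creation URL on the bastion box.

    Assumption: WebHDFS-style layout as served by HttpFS.
    """
    query = urllib.parse.urlencode(
        {"op": "CREATE", "user.name": user, "data": "true"}
    )
    path = urllib.parse.quote(hdfs_path)
    return f"{gateway.rstrip('/')}/webhdfs/v1{path}?{query}"


def upload(gateway, local_path, hdfs_path, user):
    """Option B: stream a local file to the gateway with an HTTP PUT.

    The open file object is passed as the request body, so the file is
    sent in blocks instead of being read into memory first. Redirects
    are not handled here.
    """
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": str(os.path.getsize(local_path)),
    }
    with open(local_path, "rb") as src:
        request = urllib.request.Request(
            create_url(gateway, hdfs_path, user),
            data=src,
            headers=headers,
            method="PUT",
        )
        with urllib.request.urlopen(request) as response:
            return response.status


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(put_command("big.bin", "/user/tariq/big.bin"))
    print(create_url("http://bastion:14000", "/user/tariq/big.bin", "tariq"))
```

For TB-sized files the streaming body is the important part. A secured
cluster would need authentication on top of this, which the sketch leaves
out.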
> On Wed, Jan 25, 2012 at 12:31 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> Hey Ron,
>> Thanks for the response. No, the remote machine is not a part of our
>> Hadoop ecosystem.
>> Mohammad Tariq
>> On Tue, Jan 24, 2012 at 10:23 PM, Ronald Petty <[EMAIL PROTECTED]> wrote:
>>> Is this remote machine part of the HDFS system?
>>> On Tue, Jan 24, 2012 at 7:30 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>> Hello list,
>>>> I have a situation wherein I have to move large binary files (~TB)
>>>> from remote machines into the HDFS. While looking for some way to do
>>>> this I came across Hoop. Could anyone tell me whether it fits my
>>>> use case? If so, where can I find some proper help so that I can learn
>>>> about Hoop in detail, or some place where I can find some demo apps or
>>>> some code that performs similar kinds of tasks? I am going through the
>>>> documentation at
>>>> http://cloudera.github.com/hoop/docs/latest/index.html, but it
>>>> basically talks about configuration stuff. Need some help. Many thanks.
>>>> Mohammad Tariq
> Harsh J
> Customer Ops. Engineer, Cloudera