On Mon, Oct 28, 2013 at 4:24 PM, Kyle Sletmoe
<[EMAIL PROTECTED]> wrote:
> I have written a WebHDFSClient and I do not believe that reusing
> connections is enough to noticeably speed up transfers in my case. I did
> some tests and on average it took roughly 14 minutes to transfer a 3.6 GB
> file to an HDFS on my local network (I tried the same operation using cURL,
> with similar results). I tried transferring the exact same file with the
> hdfs->dfs->copyFromLocal command, and it took on average 40 seconds. I need
> to be able to reliably transfer files that are in the 250 GB - 1TB range,
> and I really need the speed afforded by the "direct" transferring method
> that libhdfs uses. Does libhdfs work with Hadoop 2.2.0 (if I use it in
libhdfs is the basis of a lot of software built on top of HDFS, such
as Impala and fuse_dfs, and yes, it works.
Patches that improve portability are welcome. However, instead of
#ifdefs, I would rather see platform-specific files that implement
whatever functionality is platform-specific.
Another option for you is to use the new NFS v3 gateway included in
Hadoop 2. I have heard that newer versions of Windows finally include
some kind of NFS support. (However, older versions, such as Windows
XP, do not have this support.)
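
For scale, the figures quoted above (3.6 GB in about 14 minutes over
WebHDFS versus about 40 seconds with copyFromLocal) work out to roughly
4.3 MB/s versus 90 MB/s. The sketch below is only back-of-the-envelope
arithmetic: it assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and
that throughput stays constant at larger file sizes, neither of which has
been measured in this thread. The function names are made up for
illustration.

```python
MB_PER_GB = 1000  # assumption: decimal units, not stated in the thread


def rate_mb_per_s(size_gb, seconds):
    """Average throughput in MB/s for a transfer of size_gb gigabytes."""
    return size_gb * MB_PER_GB / seconds


def hours_to_transfer(size_gb, rate):
    """Hours needed to move size_gb gigabytes at a constant rate (MB/s)."""
    return size_gb * MB_PER_GB / rate / 3600


# Measurements quoted in the thread: a 3.6 GB file on a local network.
webhdfs_rate = rate_mb_per_s(3.6, 14 * 60)  # about 14 minutes
native_rate = rate_mb_per_s(3.6, 40)        # about 40 seconds

if __name__ == "__main__":
    for label, rate in (("WebHDFS", webhdfs_rate), ("copyFromLocal", native_rate)):
        print(f"{label}: {rate:.1f} MB/s")
        # Projection for the 250 GB - 1 TB range, assuming the rate holds.
        for size_gb in (250, 1000):
            print(f"  {size_gb} GB: {hours_to_transfer(size_gb, rate):.1f} hours")
```

If those rates held, a 1 TB file would take on the order of days over
WebHDFS and a few hours with the direct path, which is why the direct
method matters at that size.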
> Kyle Sletmoe
> *Urban Robotics Inc.**
> *Software Engineer
> 33 NW First Avenue, Suite 200 | Portland, OR 97209
> c: (541) 621-7516 | e: [EMAIL PROTECTED]
> On Mon, Oct 28, 2013 at 4:14 PM, Haohui Mai <[EMAIL PROTECTED]> wrote:
>> I believe that the WebHDFS API is your best bet for now. The current
>> implementation of WebHDFSClient does not reuse the HTTP connections, which
>> leads to a large part of the performance penalty.
>> You might want to implement your own version that reuses HTTP connections
>> to see whether it meets your performance requirements.
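
A minimal sketch of what reusing a connection could look like, using only
Python's standard http.client. This is an illustration, not the
WebHDFSClient implementation: the helper names webhdfs_url and stat_many
are hypothetical, and the host, port, and paths are placeholders. It sends
several GETFILESTATUS requests to the NameNode over a single keep-alive
connection. For uploads, op=CREATE is redirected to a DataNode, so a
client would presumably need to keep one connection per DataNode as well.

```python
import http.client
from urllib.parse import quote, urlencode


def webhdfs_url(path, op, **params):
    """Build the path-and-query part of a WebHDFS request (hypothetical helper)."""
    query = urlencode({"op": op, **params})
    return "/webhdfs/v1" + quote(path) + "?" + query


def stat_many(host, port, paths):
    """Issue one GETFILESTATUS per path over a single HTTP connection."""
    conn = http.client.HTTPConnection(host, port, timeout=30)
    results = {}
    try:
        for path in paths:
            conn.request("GET", webhdfs_url(path, "GETFILESTATUS"))
            resp = conn.getresponse()
            # The body must be read fully before the connection can
            # carry the next request.
            results[path] = (resp.status, resp.read())
    finally:
        conn.close()
    return results
```

A call such as stat_many("namenode.example.com", port, ["/tmp/a", "/tmp/b"])
would open the TCP connection once, as long as the server keeps it alive,
instead of once per request.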
>> On Mon, Oct 28, 2013 at 3:38 PM, Kyle Sletmoe <
>> [EMAIL PROTECTED]> wrote:
>> > Now that Hadoop 2.2.0 is Windows compatible, is there going to be work on
>> > creating a portable version of libhdfs for C/C++ interaction with HDFS? I
>> > know I can use the WebHDFS REST API, but the data transfer rates are
>> > abysmally slow compared to the direct interaction via libhdfs.
>> > Regards,
>> > --
>> > Kyle Sletmoe
>> > *Urban Robotics Inc.**
>> > *Software Engineer
>> > 33 NW First Avenue, Suite 200 | Portland, OR 97209
>> > c: (541) 621-7516 | e: [EMAIL PROTECTED]
>> > http://www.urbanrobotics.net
>> > --
>> > *Information contained herein is subject to the Code of Federal
>> > Regulations Chapter 22 International Traffic in Arms Regulations. This
>> > data may not be resold, diverted, transferred, transshipped, made
>> > available to a foreign national within the United States, or otherwise
>> > disposed of in any other country outside of its intended destination,
>> > either in original form or after being incorporated through an
>> > intermediate process into other data without the prior written approval
>> > of the US Department of State. Penalties for violation include bans on
>> > defense and military work, fines and imprisonment.*
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.