HDFS >> mail # dev >> libhdfs portability


Kyle Sletmoe 2013-10-28, 22:38
Haohui Mai 2013-10-28, 23:14
Kyle Sletmoe 2013-10-28, 23:24
Re: libhdfs portability
On Mon, Oct 28, 2013 at 4:24 PM, Kyle Sletmoe
<[EMAIL PROTECTED]> wrote:
> I have written a WebHDFSClient and I do not believe that reusing
> connections is enough to noticeably speed up transfers in my case. I did
> some tests and on average it took roughly 14 minutes to transfer a 3.6 GB
> file to an HDFS on my local network (I tried the same operation using cURL,
> with similar results). I tried transferring the exact same file with the
> "hdfs dfs -copyFromLocal" command, and it took on average 40 seconds. I need
> to be able to reliably transfer files that are in the 250 GB - 1TB range,
> and I really need the speed afforded by the "direct" transferring method
> that libhdfs uses. Does libhdfs work with Hadoop 2.2.0 (if I use it in
> Linux)?
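
For scale, the figures quoted above work out to roughly 4.3 MB/s over WebHDFS versus 90 MB/s with copyFromLocal, about a 21x gap. A minimal C sketch of that arithmetic follows. It assumes decimal units (1 GB = 1000 MB) and that throughput stays constant as file size grows; neither is stated in the thread. The function names are made up for illustration.

```c
#include <assert.h>

/* Average throughput in MB/s.
 * Assumption: decimal units, 1 GB = 1000 MB. */
double throughput_mb_per_s(double size_gb, double seconds)
{
    assert(seconds > 0.0);
    return size_gb * 1000.0 / seconds;
}

/* Projected transfer time in hours for a file of size_gb.
 * Assumption: throughput does not change with file size. */
double projected_hours(double size_gb, double mb_per_s)
{
    assert(mb_per_s > 0.0);
    return size_gb * 1000.0 / mb_per_s / 3600.0;
}
```

Under those assumptions, projected_hours(1000.0, throughput_mb_per_s(3.6, 14 * 60)) comes to roughly 65 hours for a 1 TB file, against about 3 hours at the copyFromLocal rate.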

libhdfs is the basis of a lot of software built on top of HDFS, such
as Impala and fuse_dfs, and yes, it works.

Patches that improve portability are welcome.  However, rather than
#ifdefs, I would rather see platform-specific files that implement
whatever functionality is platform-specific.
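
To illustrate the layout suggested above, here is a rough sketch of one shared interface with a per-platform implementation file, shown in a single listing for brevity. The file names, the platform_mutex type, and the choice of a mutex wrapper are all hypothetical and are not taken from the libhdfs source tree; only the POSIX side is written out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* ---- platform_mutex.h (hypothetical name): shared interface, no #ifdefs ---- */
typedef struct platform_mutex platform_mutex;   /* opaque to callers */

platform_mutex *platform_mutex_create(void);
void platform_mutex_lock(platform_mutex *m);
void platform_mutex_unlock(platform_mutex *m);
void platform_mutex_destroy(platform_mutex *m);

/* ---- os/posix/platform_mutex.c (hypothetical path): built only on POSIX ---- */
struct platform_mutex {
    pthread_mutex_t impl;
};

platform_mutex *platform_mutex_create(void)
{
    platform_mutex *m = malloc(sizeof *m);
    if (m != NULL && pthread_mutex_init(&m->impl, NULL) != 0) {
        free(m);
        m = NULL;
    }
    return m;   /* NULL on allocation or initialization failure */
}

void platform_mutex_lock(platform_mutex *m)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->impl);
}

void platform_mutex_unlock(platform_mutex *m)
{
    assert(m != NULL);
    pthread_mutex_unlock(&m->impl);
}

void platform_mutex_destroy(platform_mutex *m)
{
    if (m != NULL) {
        pthread_mutex_destroy(&m->impl);
        free(m);
    }
}

/* A sibling os/windows/platform_mutex.c (also hypothetical) would define the
 * same struct and functions on top of CRITICAL_SECTION. The build system
 * compiles exactly one of the two files for each target. */

/* ---- portable caller: contains no platform conditionals ---- */
static long counter;

long increment_counter(platform_mutex *m)
{
    platform_mutex_lock(m);
    long value = ++counter;
    platform_mutex_unlock(m);
    return value;
}
```

The point of the split is that callers such as increment_counter compile unchanged on every platform, and the platform choice is made once, in the build configuration, instead of at every call site.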

Another option for you is to use the new NFS v3 gateway included in
Hadoop 2.  I have heard that newer versions of Windows finally include
some kind of NFS support.  (However, older versions, such as Windows
XP, do not have this support).

best,
Colin
>
> --
> Kyle Sletmoe
>
> *Urban Robotics Inc.**
> *Software Engineer
>
> 33 NW First Avenue, Suite 200 | Portland, OR 97209
> c: (541) 621-7516 | e: [EMAIL PROTECTED]
>
> http://www.urbanrobotics.net
>
>
> On Mon, Oct 28, 2013 at 4:14 PM, Haohui Mai <[EMAIL PROTECTED]> wrote:
>
>> I believe that the WebHDFS API is your best bet for now. The current
>> implementation of WebHDFSClient does not reuse the HTTP connections, which
>> leads to a large part of the performance penalty.
>>
>> You might want to implement your own version that reuses HTTP connections to
>> see whether it meets your performance requirements.
>>
>> Thanks,
>> Haohui
>>
>>
>> On Mon, Oct 28, 2013 at 3:38 PM, Kyle Sletmoe <[EMAIL PROTECTED]> wrote:
>>
>> > Now that Hadoop 2.2.0 is Windows compatible, is there going to be work on
>> > creating a portable version of libhdfs for C/C++ interaction with HDFS? I
>> > know I can use the WebHDFS REST API, but the data transfer rates are
>> > abysmally slow compared to the direct interaction via libhdfs.
>> >
>> > Regards,
>> > --
>> > Kyle Sletmoe
>> >
>> > *Urban Robotics Inc.**
>> > *Software Engineer
>> >
>> > 33 NW First Avenue, Suite 200 | Portland, OR 97209
>> > c: (541) 621-7516 | e: [EMAIL PROTECTED]
>> >
>> > http://www.urbanrobotics.net
>> >
>> > --
>> > *Information contained herein is subject to the Code of Federal
>> > Regulations Chapter 22 International Traffic in Arms Regulations. This
>> > data may not be resold, diverted, transferred, transshipped, made
>> > available to a foreign national within the United States, or otherwise
>> > disposed of in any other country outside of its intended destination,
>> > either in original form or after being incorporated through an
>> > intermediate process into other data without the prior written approval
>> > of the US Department of State. **Penalties for violation include bans on
>> > defense and military work, fines and imprisonment.*
>> >
>>
>> --
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
>