

Re: Deprecate hftp / hsftp
The problem with replacing hftp with WebHDFS is that the datanode web interface and the WebHDFS servlet share the same Jetty container. If I want to prevent downloads via WebHDFS to corporate workstations, I have to firewall the Jetty port. But firewalling the Jetty port on the datanode breaks the HDFS file viewer. Alternatively, I could disable WebHDFS, but then I can no longer transfer data between Hadoop clusters running different versions.

I think what's missing from WebHDFS is a deny/allow list similar to what httpd has. This would let operations teams configure WebHDFS so that file transfers only happen between defined networks, while the other Jetty servlets stay available because we would no longer have to firewall the Jetty port.
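
A minimal sketch of what such a check might look like, assuming IPv4-only CIDR lists and a deny-first, default-deny evaluation order. The class name NetworkAccessList, its constructor, and isAllowed are hypothetical names made up for illustration; nothing here is an existing Hadoop or WebHDFS API, and httpd's own ordering rules are more flexible than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a deny/allow list; not part of Hadoop or WebHDFS. */
final class NetworkAccessList {

    /** One IPv4 CIDR block, e.g. "10.1.0.0/16". */
    private static final class Cidr {
        private final int network;
        private final int mask;

        Cidr(String spec) {
            String[] parts = spec.split("/");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected a.b.c.d/prefix: " + spec);
            }
            int prefix = Integer.parseInt(parts[1]);
            if (prefix < 0 || prefix > 32) {
                throw new IllegalArgumentException("Bad prefix length: " + spec);
            }
            // Shifting an int by 32 is a no-op in Java, so /0 needs its own case.
            this.mask = (prefix == 0) ? 0 : -1 << (32 - prefix);
            this.network = toInt(parts[0]) & mask;
        }

        boolean contains(String ip) {
            return (toInt(ip) & mask) == network;
        }

        /** Packs a dotted-quad IPv4 address into a 32-bit int. */
        private static int toInt(String dotted) {
            String[] octets = dotted.split("\\.");
            if (octets.length != 4) {
                throw new IllegalArgumentException("Not an IPv4 address: " + dotted);
            }
            int value = 0;
            for (String octet : octets) {
                int n = Integer.parseInt(octet);
                if (n < 0 || n > 255) {
                    throw new IllegalArgumentException("Bad octet in: " + dotted);
                }
                value = (value << 8) | n;
            }
            return value;
        }
    }

    private final List<Cidr> allow = new ArrayList<>();
    private final List<Cidr> deny = new ArrayList<>();

    public NetworkAccessList(List<String> allowSpecs, List<String> denySpecs) {
        for (String spec : allowSpecs) {
            allow.add(new Cidr(spec));
        }
        for (String spec : denySpecs) {
            deny.add(new Cidr(spec));
        }
    }

    /** Deny entries win; otherwise the client must match an allow entry. */
    public boolean isAllowed(String clientIp) {
        for (Cidr block : deny) {
            if (block.contains(clientIp)) {
                return false;
            }
        }
        for (Cidr block : allow) {
            if (block.contains(clientIp)) {
                return true;
            }
        }
        return false; // default deny: unknown networks get no file transfers
    }

    public static void main(String[] args) {
        // Example networks are invented: two cluster subnets, minus a workstation range.
        NetworkAccessList acl = new NetworkAccessList(
                List.of("10.1.0.0/16", "10.2.0.0/16"),
                List.of("10.1.99.0/24"));
        System.out.println(acl.isAllowed("10.1.5.20"));    // true
        System.out.println(acl.isAllowed("10.1.99.7"));    // false
        System.out.println(acl.isAllowed("192.168.1.10")); // false
    }
}
```

A check like this would only need to guard the WebHDFS servlet, which is what would leave the other Jetty servlets, such as the file viewer, reachable without a port-level firewall rule.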

I should probably go search through the open tickets to see if a deny/allow list has already been requested and, if not, open a JIRA. :)

On Nov 26, 2013, at 1:40 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:

> Thanks Haohui for all your hard work in this area. I am +1 on this proposal.
>
>
> On Tue, Nov 26, 2013 at 12:50 PM, Haohui Mai <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Recently I've been focusing on fixing hftp / hsftp / webhdfs / swebhdfs in
>> various setups. We have now reached the state where all of the above file
>> systems work in both secure and insecure clusters, and transfer data
>> through both http and https.
>>
>> Taking a step back, these file systems are very similar, and I'm wondering
>> whether it is a good time to deprecate hftp and hsftp in Hadoop right now.
>>
>> The main reason is that hftp / hsftp provide only a strict subset of the
>> functionality that webhdfs / swebhdfs offer. Notably, webhdfs / swebhdfs
>> support writes and HA, which hftp / hsftp do not. It seems more natural
>> to move forward with webhdfs / swebhdfs; keeping hftp / hsftp around
>> just seems to introduce more work.
>>
>> Another reason is that webhdfs has been supported since Hadoop 1, so
>> getting rid of hftp / hsftp does not seem to remove any features, even for
>> users who are trying to migrate from Hadoop 1 to Hadoop 2.
>>
>>
>> Your ideas are appreciated.
>>
>> Thanks,
>> Haohui
>>
>> --
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
>
>
>
> --
> http://hortonworks.com/download/