HDFS >> mail # user >> Encryption in HDFS


Seonyeong Bak 2013-02-26, 05:10
java8964 java8964 2013-02-26, 19:52
Re: Encryption in HDFS
You can encrypt the splits separately.

The issue of key management is actually a layer above this.

It looks like the research is about the encryption process with a known key.
The layer above would handle key management which can be done a couple of different ways...
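
To illustrate what encrypting the splits separately could look like, here is a minimal sketch using the JCA. It is not the original poster's codec: the class `SplitCrypto`, its method names, and the layout (a random 16-byte IV stored in front of each chunk's ciphertext) are assumptions made for this example. The key is simply passed in as a known value, since key management belongs to the layer above. AES in CTR mode is used here because it needs no padding; note that CTR provides confidentiality only, not integrity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: encrypts each chunk (split) independently of the others. */
final class SplitCrypto {
    private static final int IV_LEN = 16; // AES block size in bytes
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypts one chunk with a fresh random IV; returns IV followed by ciphertext. */
    static byte[] encryptChunk(byte[] key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(iv));
            byte[] ct = cipher.doFinal(plain);
            return ByteBuffer.allocate(IV_LEN + ct.length).put(iv).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts one chunk using only the key and the IV stored in that chunk. */
    static byte[] decryptChunk(byte[] key, byte[] chunk) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(chunk, 0, IV_LEN));
            return cipher.doFinal(chunk, IV_LEN, chunk.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // demo only: an all-zero key must never be used in practice
        byte[][] splits = {
            "first split".getBytes(StandardCharsets.UTF_8),
            "second split".getBytes(StandardCharsets.UTF_8)
        };
        byte[][] encrypted = new byte[splits.length][];
        for (int i = 0; i < splits.length; i++) {
            encrypted[i] = encryptChunk(key, splits[i]);
        }
        // Decrypt only the second split; the first one is never read.
        System.out.println(new String(decryptChunk(key, encrypted[1]), StandardCharsets.UTF_8));
    }
}
```

Because every chunk carries its own IV, a task that holds the key can decrypt any one chunk without reading the others. How chunk boundaries would line up with HDFS blocks or input splits is left open in this sketch.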

On Feb 26, 2013, at 1:52 PM, java8964 java8964 <[EMAIL PROTECTED]> wrote:

> I am also interested in your research. Can you share some insight about the following questions?
>
> 1) When you use CompressionCodec, can the encrypted file be split? From my understanding, there is no encryption scheme that allows the file to be decrypted block by block, right? For example, if I have a 1 GB file encrypted with AES, how do you (or can you) decrypt it block by block, instead of using a single mapper to decrypt the whole file?
> 2) In your CompressionCodec implementation, do you use DecompressorStream or BlockDecompressorStream? If BlockDecompressorStream, can you share some examples? Right now, I am having problems using BlockDecompressorStream to do exactly the same thing as you did.
> 3) Do you have any plans to share your code, especially if you did use BlockDecompressorStream to decrypt the encrypted file block by block in a Hadoop MapReduce job?
>
> Thanks
>
> Yong
>
> From: [EMAIL PROTECTED]
> Date: Tue, 26 Feb 2013 14:10:08 +0900
> Subject: Encryption in HDFS
> To: [EMAIL PROTECTED]
>
> Hello, I'm a university student.
>
> I implemented AES and Triple DES encryption as a CompressionCodec, using the Java Cryptography Architecture (JCA).
> The encryption is performed by a client node using the Hadoop API.
> Map tasks read blocks from HDFS, and each map task decrypts the blocks it reads.
> I tested my implementation against generic HDFS.
> My cluster consists of 3 nodes (1 master node, 3 worker nodes), and each machine has a quad-core processor (i7-2600) and 4 GB of memory.
> The test input is 1 TB of text, made up of 32 text files of 32 GB each.
>
> I expected encryption to take much more time than generic HDFS,
> but the performance does not differ significantly.
> The decryption step takes about 5-7% more time than generic HDFS.
> The encryption step takes about 20-30% more time than generic HDFS, because it is single-threaded and runs on one client node.
> So the encryption performance could still be improved.
>
> Could there be an error in my test?
>
> I know there are several implementations for encrypting files in HDFS.
> Are these implementations enough to secure HDFS?
>
> best regards,
>
> seonpark
>
> * Sorry for my bad English
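
On the question raised in the quoted message of whether an AES-encrypted file can be decrypted block by block: with AES in CTR mode, decryption can start at any 16-byte-aligned offset, because the keystream for cipher block n depends only on the key and the initial counter plus n. The sketch below shows that idea with the JCA. It is not the original poster's implementation; the class `CtrSeek` and its methods are hypothetical names, and it assumes the whole file was encrypted with one key and one IV that the reader already knows.

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: decrypts AES-CTR data starting at an arbitrary block offset. */
final class CtrSeek {
    private static final int BLOCK = 16; // AES block size in bytes

    /** Returns the counter value for the given cipher block: (iv + blockIndex) mod 2^128. */
    static byte[] counterAt(byte[] iv, long blockIndex) {
        byte[] raw = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex)).toByteArray();
        byte[] out = new byte[BLOCK];
        int n = Math.min(raw.length, BLOCK);
        // Keep the low 16 bytes, right-aligned; this drops a sign byte or an overflow carry.
        System.arraycopy(raw, raw.length - n, out, BLOCK - n, n);
        return out;
    }

    /** Encrypts the whole buffer with AES-CTR, starting from the given IV. */
    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(iv));
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts ciphertext[offset..end) without processing the bytes before offset. */
    static byte[] decryptFrom(byte[] key, byte[] iv, byte[] ciphertext, int offset) {
        if (offset % BLOCK != 0) {
            throw new IllegalArgumentException("offset must be a multiple of 16");
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(counterAt(iv, offset / BLOCK)));
            return cipher.doFinal(ciphertext, offset, ciphertext.length - offset);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // demo only: fixed key and IV must never be used in practice
        byte[] iv = new byte[16];
        byte[] plain = new byte[100];
        for (int i = 0; i < plain.length; i++) plain[i] = (byte) i;
        byte[] ct = encrypt(key, iv, plain);
        byte[] tail = decryptFrom(key, iv, ct, 48);
        System.out.println(Arrays.equals(tail, Arrays.copyOfRange(plain, 48, 100)));
    }
}
```

A map task whose split starts at some byte offset could, under these assumptions, initialize its cipher at that offset instead of decrypting the file from the beginning. Offsets that are not multiples of 16 would need the reader to back up to the previous block boundary and discard the extra bytes, which this sketch does not do.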

Mathias Herberts 2013-02-26, 06:43
Seonyeong Bak 2013-02-28, 15:28