HDFS, mail # dev - Secure deletion of blocks


Re: Secure deletion of blocks
Colin McCabe 2013-08-20, 19:43
Just to clarify, ext4 has the option to turn off journalling.  ext3 does
not.  Not sure about reiser.

Colin
On Tue, Aug 20, 2013 at 12:42 PM, Colin McCabe <[EMAIL PROTECTED]> wrote:

> > If I've got the right idea about this at all?
>
> From the man page for wipe(1):
>
> "Journaling filesystems (such as Ext3 or ReiserFS) are now being used by
> default by most Linux distributions. No secure deletion program that does
> filesystem-level calls can sanitize files on such filesystems, because
> sensitive data and metadata can be written to the journal, which cannot be
> readily accessed. Per-file secure deletion is better implemented in the
> operating system."
>
> You might be able to work around this by turning off the journal on these
> filesystems.  But even then, you've got issues like the drive remapping bad
> sectors (and leaving around the old ones), flash firmware that is unable to
> erase less than an erase block, etc.
>
> The simplest solution is probably just to use full-disk encryption.  Then
> you don't need any code changes at all.
>
> Doing something like invoking shred on the block files could improve
> security somewhat, but it's not going to work all the time.
>
> Colin
>
>
> On Thu, Aug 15, 2013 at 5:31 AM, Matt Fellows <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> I'm looking into writing a patch for HDFS that adds a new method to
>> securely delete the contents of a block on every node on which it
>> exists. By "securely delete" I mean overwriting the block cyclically
>> with 1s, 0s and random data so that the data cannot be recovered
>> forensically.
>>
>> I'm not currently aware of any existing code / methods which provide
>> this, so was going to implement this myself.
>>
>> I figured DataNode.java was probably the place to start looking into
>> how this could be done, so I've read the source for it, but it hasn't
>> really enlightened me much.
>>
>> I'm assuming I need to tell the NameNode that a particular block id must
>> be securely deleted from every DataNode that holds it; then, as each
>> DataNode calls home, it would be instructed to securely delete the
>> relevant block, and it would oblige.
>>
>> Unfortunately I have no idea where to begin and was looking for some
>> pointers?
>>
>> I guess specifically I'd like to know:
>>
>> 1. Where the hdfs CLI commands are implemented
>> 2. How a DataNode identifies a block / how the NameNode could instruct a
>> DataNode to delete a block
>> 3. Where the existing "delete" is implemented so I can make sure my
>> secure delete makes use of it after successfully blanking the block contents
>> 4. If I've got the right idea about this at all?
>>
>> Kind regards,
>> Matt Fellows
>>
>
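
For reference, the overwrite-then-delete idea discussed in this thread (cycling through 0s, 1s and random data, roughly what shred does) could be sketched in plain Java as below. This is only an illustration under assumptions: BlockFileWiper and its methods are hypothetical names, not part of HDFS, and the code works on an ordinary local file rather than going through any DataNode API. As Colin points out, overwriting in place gives no guarantee on journaling filesystems, on drives that remap bad sectors, or on flash storage.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.SecureRandom;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of HDFS. */
final class BlockFileWiper {
    private static final int BUFFER_SIZE = 64 * 1024;

    /** Fills buf with the pattern for this pass: 0x00, then 0xFF, then random, repeating. */
    static void fillPattern(byte[] buf, int pass, SecureRandom rng) {
        switch (pass % 3) {
            case 0:
                Arrays.fill(buf, (byte) 0x00);
                break;
            case 1:
                Arrays.fill(buf, (byte) 0xFF);
                break;
            default:
                rng.nextBytes(buf);
        }
    }

    /** Overwrites the file in place 'passes' times without changing its length. */
    static void overwrite(Path file, int passes) throws IOException {
        SecureRandom rng = new SecureRandom();
        byte[] buf = new byte[BUFFER_SIZE];
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            long size = ch.size();
            for (int pass = 0; pass < passes; pass++) {
                ch.position(0);
                long remaining = size;
                while (remaining > 0) {
                    fillPattern(buf, pass, rng);
                    int n = (int) Math.min(buf.length, remaining);
                    ByteBuffer bb = ByteBuffer.wrap(buf, 0, n);
                    while (bb.hasRemaining()) {
                        ch.write(bb);
                    }
                    remaining -= n;
                }
                // Ask the OS to push this pass to the device before starting the next.
                // This cannot defeat journals, sector remapping or flash wear levelling.
                ch.force(true);
            }
        }
    }

    /** Overwrites the file, then removes it with an ordinary delete. */
    static void overwriteAndDelete(Path file, int passes) throws IOException {
        overwrite(file, passes);
        Files.delete(file);
    }
}
```

A caller would pass the local path of a block file and a pass count, for example BlockFileWiper.overwriteAndDelete(path, 3). Forcing after every pass is a deliberate choice here: without it the page cache could coalesce the passes so that only the last pattern ever reaches the disk.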