I want to delete a sensitive file (using C++) in such a way that the file will not be recoverable.
I was thinking of simply overwriting the file and then deleting it. Is that enough, or do I have to perform more actions?
Here is an interesting paper:
http://www.filesystems.org/docs/secdel/secdel.html
It addresses some issues with overwriting files. In particular, you can't be sure that the newly written data was written to the same location as the old data. It also argues that, on modern media, data that was overwritten just a few times, or even once, is impossible to recover.
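For reference, the overwrite-then-delete idea from the question might look roughly like the sketch below. It is only a best-effort illustration: the function names `overwrite_in_place` and `overwrite_and_remove` are made up for this example, `std::mt19937` is not a cryptographic generator, and `flush()` only hands the data to the operating system. Nothing here guarantees that the new bytes land on the same physical blocks as the old ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <random>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: overwrite the logical contents of `path` with
// pseudo-random bytes, keeping the file size unchanged.
// Returns false if the file cannot be sized, opened, or written.
bool overwrite_in_place(const fs::path& path) {
    std::error_code ec;
    const std::uintmax_t size = fs::file_size(path, ec);
    if (ec) return false;

    // in | out opens an existing file without truncating it.
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;

    std::random_device rd;
    std::mt19937 gen(rd());  // assumption: non-cryptographic noise is acceptable
    std::uniform_int_distribution<int> dist(0, 255);

    std::vector<char> buf(64 * 1024);
    std::uintmax_t remaining = size;
    while (remaining > 0) {
        const std::size_t n = static_cast<std::size_t>(
            std::min<std::uintmax_t>(remaining, buf.size()));
        for (std::size_t i = 0; i < n; ++i) {
            buf[i] = static_cast<char>(dist(gen));
        }
        f.write(buf.data(), static_cast<std::streamsize>(n));
        if (!f) return false;
        remaining -= n;
    }

    // flush() pushes the data to the OS only. Forcing it to the device needs
    // a platform call such as fsync (POSIX) or FlushFileBuffers (Windows),
    // which this sketch leaves out.
    f.flush();
    return static_cast<bool>(f);
}

// Hypothetical helper: overwrite, then unlink the file.
bool overwrite_and_remove(const fs::path& path) {
    if (!overwrite_in_place(path)) return false;
    std::error_code ec;
    return fs::remove(path, ec) && !ec;
}
```

A call such as `overwrite_and_remove("secret.txt")` returns `true` only if both the overwrite and the removal succeeded. Even then, it only says that the logical file was overwritten, not that every physical copy of the data is gone.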
Worst case scenario, you can't be sure of having done it without physically destroying the drive. It's possible that you're running on a journaling filesystem, that keeps the original whenever you modify a file to allow disaster recovery if the modification is interrupted by power failure or whatever. This might mean that modifying a file moves it on the physical drive, leaving the old location unchanged.
Furthermore, some filesystems deliberately keep the old version around as long as possible to allow it to be recovered. Consider, for example, shadow storage copies on Windows: when you modify a disk block that belongs to a file that's part of a system restore point, the new data is written to a new block and the old one is kept around.
There are APIs to disable shadow storage copies for a file, a directory, or the whole disk (I don't know the details; it might require admin privileges).
Another gotcha is filesystem-level compression. If you overwrite a file with random data, chances are you make it less compressible and hence larger on disk, even though its logical size is unchanged, so the filesystem might have to relocate it. I don't know offhand whether Windows guarantees to keep using the old blocks for the start of the new, larger file. If you overwrite with zeros instead, you make it more compressible, so the new data might not reach as far as the end of the old data.
If the drive has ever been defragged (IIRC Windows nowadays does this in the background by default), then nothing you do to the file necessarily affects copies of the data in previous locations.
shred and similar tools simply don't work under these fairly common conditions.
Stretching a point, you can imagine a custom filesystem where all changes are journalled, backed up for future rollback recovery, and copied to off-site backup as soon as possible. I'm not aware of any such system (although of course there are automatic backup programs that run above the filesystem level with the same basic effect), but Windows certainly doesn't have an API to say, "OK, you can delete the off-site backup now", because Windows has no idea that it's happening.
This is even before you consider the possibility that someone has special kit that can detect data on magnetic disks even after it's been overwritten with new data. Opinions vary on how plausible such attacks really are on modern disks, which are so densely packed that there's not a lot of space for residuals of old values. But it's academic, really, since in most practical circumstances you can't even be sure of overwriting the old data short of unmounting the drive and overwriting each sector using low-level tools.
Oh yeah, flash drives are no better: they re-map logical sectors to physical sectors, a bit like virtual memory. This is so that they can cope with failed sectors, do wear-leveling, that sort of thing. So even at a low level, just because you overwrite a particular numbered sector doesn't mean the old data won't pop up in some other numbered sector in future.