I'm building an application that needs to store sensitive information. The data must be encrypted in my database so that a hacker or an employee with database access cannot decipher it. However, it still needs to be searchable (to some degree).
I understand certain compromises may have to be made. For example, I'm willing to leave some data attributes unencrypted to make them indexable if necessary, but "the main body" must be encrypted.
What are some best practices and approaches for storing sensitive data that needs to be viewable, searchable, and/or sortable by authorized people?
(I was thinking of extracting the non-stop words from the "body", putting them in random order in a separate field before encrypting the body, and then feeding that field to a search indexer, but I doubt it provides any real security.)
Update: You'll want to check out CipherSweet instead of rolling your own design. It takes care of a lot of subtle security details and has a straightforward security argument.
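Plain (unkeyed) hash functions aren't the solution here: low-entropy values such as SSNs can be brute-forced by anyone who can read the hashes. As the accepted answer suggests, indexing encrypted data requires a "blind index", facilitated by a MAC whose key is kept out of the database.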
Let's say you're encrypting social security numbers. When you insert them into the database, you might do something like this:
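// Encrypt the SSN for storage; $our_encryption_key is a \Defuse\Crypto\Key object.
$ssn_encrypted = \Defuse\Crypto\Crypto::encrypt($ssn, $our_encryption_key);
// Keyed hash (HMAC) of the SSN, used as the blind index; $our_search_key is a separate secret.
$ssn_blind_idx = \hash_hmac('sha512', $ssn, $our_search_key);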
And then store both values in the database. When you need to quickly grab a value based on an SSN input, you can recalculate the HMAC and search based on that.
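To make the lookup step concrete, here is a minimal sketch of recalculating the HMAC and searching on it. It is an illustration under assumptions, not the definitive implementation: the `blindIndex()` and `findBySsn()` helpers, the in-memory `$rows` array standing in for an indexed table, and the placeholder ciphertext strings are all hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical key for illustration only; in practice, load it from
// configuration that lives outside source control.
$searchKey = random_bytes(32);

/** Compute the blind index for a plaintext value (HMAC-SHA512, hex-encoded). */
function blindIndex(string $plaintext, string $searchKey): string
{
    return hash_hmac('sha512', $plaintext, $searchKey);
}

// In-memory stand-in for a table with an indexed ssn_blind_idx column.
// The ciphertext values are opaque placeholders, not real encryption output.
$rows = [
    ['id' => 1, 'ssn_encrypted' => '<ciphertext 1>', 'ssn_blind_idx' => blindIndex('123-45-6789', $searchKey)],
    ['id' => 2, 'ssn_encrypted' => '<ciphertext 2>', 'ssn_blind_idx' => blindIndex('987-65-4321', $searchKey)],
];

/** Return the rows whose stored blind index matches the searched SSN. */
function findBySsn(array $rows, string $ssn, string $searchKey): array
{
    // Recalculate the HMAC of the search input and compare it to the stored values.
    $needle = blindIndex($ssn, $searchKey);
    return array_values(array_filter(
        $rows,
        fn (array $row): bool => hash_equals($row['ssn_blind_idx'], $needle)
    ));
}

$matches = findBySsn($rows, '123-45-6789', $searchKey);
```

Against a real database, the same idea would be a `WHERE ssn_blind_idx = ?` query with the recalculated HMAC as the bound parameter. Note that this only finds exact matches, so the input has to be normalized the same way (for example, with or without dashes) at insert time and at search time.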
The database never sees the SSN, and your encryption keys should never be checked into source control (SVN, git, etc.).
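One way to keep keys out of source control is to read them from the environment at runtime. The following is only a sketch under assumptions: the `loadKeyFromEnv()` helper, the `APP_SEARCH_KEY` variable name, the hex encoding, and the 32-byte minimum are illustrative choices, not something prescribed by the answer above.

```php
<?php
declare(strict_types=1);

/**
 * Load a hex-encoded secret key from an environment variable.
 * Hypothetical helper: the name and the 32-byte minimum are assumptions.
 */
function loadKeyFromEnv(string $name): string
{
    $hex = getenv($name);
    if ($hex === false || $hex === '') {
        throw new RuntimeException("Missing environment variable: {$name}");
    }

    // hex2bin() returns false (and warns) on malformed input, so suppress
    // the warning and check the result explicitly.
    $key = @hex2bin($hex);
    if ($key === false || strlen($key) < 32) {
        throw new RuntimeException("{$name} must be hex-encoded and at least 32 bytes long");
    }

    return $key;
}
```

The search key for the blind index could then be obtained with `loadKeyFromEnv('APP_SEARCH_KEY')`, with the variable set by the deployment environment rather than by a file in the repository.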