
Does Spark support Encryption at Rest?

Hadoop has recently introduced Encryption at Rest (HDFS-6134). I'd like to know whether Spark supports it as well. That is, can Spark process data that is stored in encrypted form in HDFS?

HHH asked Sep 06 '25 23:09


1 Answer

Yes, Spark can access the data without any changes to the application code. HDFS encrypts and decrypts the data transparently to applications, which means all your Java APIs and command-line interfaces work as before. The framework takes care of encryption for you.

Here is a quote from the documentation:

HDFS implements transparent, end-to-end encryption. Once configured, data read from and written to HDFS is transparently encrypted and decrypted without requiring changes to user application code.

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
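To illustrate the transparency described in the quote, here is a minimal PySpark sketch. It assumes a hypothetical encryption zone at /secure, a cluster whose Hadoop client configuration already points at the key management server, and a user who is allowed to use the zone key. The helper name count_lines and both paths are made up for this example. The point is only that the read call looks the same for encrypted and unencrypted paths.

```python
def count_lines(spark, path):
    """Count lines in a text dataset.

    The same call is used whether or not `path` lies inside an
    HDFS encryption zone.
    """
    # Decryption happens inside the HDFS client that Spark uses,
    # so no encryption-specific options are passed here.
    return spark.read.text(path).count()


def main():
    # Imported lazily so count_lines can be reused without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("encrypted-zone-read").getOrCreate()
    try:
        # Hypothetical paths: /secure is assumed to be an encryption zone,
        # /plain is assumed to be an ordinary directory.
        print(count_lines(spark, "hdfs:///secure/events"))
        print(count_lines(spark, "hdfs:///plain/events"))
    finally:
        spark.stop()
```

Calling main() from a script launched with spark-submit would run both reads. If the user lacks access to the zone key, the first read should fail with an authorization error rather than return ciphertext.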

You will however be required to add/modify some configuration. Here's a worked example.

See also blog.cloudera.com/blog/2015/01/new-in-cdh-5-3-transparent-encryption-in-hdfs
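On the configuration point: one client-side setting is the address of the key provider (KMS), which current Hadoop documentation names hadoop.security.key.provider.path (older releases used dfs.encryption.key.provider.uri). It is normally set in the cluster's Hadoop configuration files, in which case Spark needs nothing extra. The sketch below shows one possible way to pass it from Spark instead, relying on Spark's spark.hadoop.* prefix for forwarding Hadoop properties. The KMS host, the port, and the helper names are assumptions for illustration, not values from the original answer.

```python
def kms_spark_conf(kms_uri):
    """Return Spark settings that forward the key provider URI to Hadoop."""
    # Spark copies each `spark.hadoop.<key>` property into the Hadoop
    # Configuration as `<key>`.
    return {"spark.hadoop.hadoop.security.key.provider.path": kms_uri}


def build_session(app_name, kms_uri):
    """Create a SparkSession whose Hadoop client knows where the KMS is."""
    # Lazy import so kms_spark_conf can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in kms_spark_conf(kms_uri).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Hypothetical KMS address; use the one your cluster administrators provide.
EXAMPLE_KMS_URI = "kms://http@kms.example.com:9600/kms"
```

For example, build_session("encrypted-zone-job", EXAMPLE_KMS_URI) would return a session configured this way. Creating the key and the encryption zone itself is an administrator task on the HDFS side and is not something the Spark job does.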

Shyam answered Sep 09 '25 19:09