Does AWS provide any storage solutions that satisfy the following criteria?
In my mind, an NFS-like volume should satisfy all three, but I don't know whether EBS, EFS, and/or EMRFS can be used that way. At a minimum, I am looking for something that gives me (1) and (2).
In the context of the questions above, I looked into EBS, but I found conflicting information on this topic.
The EMR documentation says that EBS volumes are ephemeral in EMR:
"Amazon EBS works differently within Amazon EMR than it does with regular Amazon EC2 instances. Amazon EBS volumes attached to EMR clusters are ephemeral: the volumes are deleted upon cluster and instance termination (for example, when shrinking instance groups), so it’s important that you not expect data to persist."
Meanwhile, I see an option called "Delete on termination" in EBS that can be set to False (see the screenshot below).

EFS is the service you are looking for. You can mount it on EC2 nodes running in multiple Availability Zones in the same region.
The EC2 instances mount Amazon EFS file systems via the NFSv4 protocol, using standard operating system mount commands.
You can also mount the EFS file system on every node of an EMR cluster through a bootstrap script.
It will satisfy all three criteria for you.
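As a rough illustration of the bootstrap idea above, here is a minimal Python sketch that builds the standard NFSv4.1 mount command for an EFS file system. The file system ID, region, and mount point are placeholders I made up, and the function names are hypothetical; the NFS options are the ones AWS documents as recommended for EFS. It assumes the node has an NFS client installed and that the security groups allow NFS traffic (port 2049) between the nodes and the EFS mount target.

```python
import subprocess

# NFS mount options that AWS documents as recommended for EFS.
EFS_NFS_OPTIONS = (
    "nfsvers=4.1,rsize=1048576,wsize=1048576,"
    "hard,timeo=600,retrans=2,noresvport"
)


def efs_dns_name(file_system_id: str, region: str) -> str:
    """Return the regional DNS name of an EFS file system."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def build_mount_command(file_system_id: str, region: str,
                        mount_point: str = "/mnt/efs") -> list[str]:
    """Build (but do not run) the mount command for the file system root."""
    source = f"{efs_dns_name(file_system_id, region)}:/"
    return ["sudo", "mount", "-t", "nfs4", "-o", EFS_NFS_OPTIONS,
            source, mount_point]


def mount_efs(file_system_id: str, region: str,
              mount_point: str = "/mnt/efs") -> None:
    """Create the mount point and mount the file system on this node."""
    subprocess.run(["sudo", "mkdir", "-p", mount_point], check=True)
    subprocess.run(build_mount_command(file_system_id, region, mount_point),
                   check=True)


if __name__ == "__main__":
    # Placeholder values: replace with your own file system ID and region.
    # This only prints the command; call mount_efs() to actually mount.
    print(" ".join(build_mount_command("fs-12345678", "us-east-1")))
```

On a real cluster, something like `mount_efs(...)` would be called from the script that the EMR bootstrap action runs, so that every node mounts the same file system at the same path when it starts.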