I want to create an EMR cluster on Amazon EMR, triggered via Airflow. The cluster shows up in the Amazon EMR UI, but with an error: "The VPC/subnet configuration was invalid: Subnet is required : The specified instance type m5.xlarge can only be used in a VPC"
Below are the code snippet and the JSON-format config details used in the Airflow script for this task.
My question is: how can I incorporate the VPC and subnet IDs into the JSON (if this is even possible)? There are no explicit examples out there.
Hint: a network (VPC) and an EC2 subnet have already been created.
JOB_FLOW_OVERRIDES = {
    "Name": "sentiment_analysis",
    "ReleaseLabel": "emr-5.33.0",
    "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],  # we want our EMR cluster to have HDFS and Spark
    "Configurations": [
        {
            "Classification": "spark-env",
            "Configurations": [
                {
                    "Classification": "export",
                    "Properties": {"PYSPARK_PYTHON": "/usr/bin/python3"},  # EMR defaults to Python 2; switch it to Python 3
                }
            ],
        }
    ],
    "Instances": {
        "InstanceGroups": [
            {
                "Name": "Master node",
                "Market": "SPOT",
                "InstanceRole": "MASTER",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 1,
            },
            {
                "Name": "Core - 2",
                "Market": "SPOT",  # spot instances are "use as available" capacity
                "InstanceRole": "CORE",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 2,
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
        "TerminationProtected": False,  # this lets us terminate the cluster programmatically
    },
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}
create_emr_cluster = EmrCreateJobFlowOperator(
    task_id="create_emr_cluster",
    job_flow_overrides=JOB_FLOW_OVERRIDES,
    aws_conn_id="aws_default",
    emr_conn_id="emr_default",
    dag=dag,
)
EmrCreateJobFlowOperator calls create_job_flow from the EMR hook (emr.py), which accepts the same configuration as the boto3 EMR client's run_job_flow API.
Therefore you can add an "Ec2SubnetId" key, with your subnet ID as its value, inside the "Instances" dictionary, as in the sketch below.
This works for me on Apache Airflow 2.0.2.
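For example, here is a minimal sketch of the "Instances" block from the question with the subnet added. The subnet ID below is a placeholder; substitute the ID of the EC2 subnet you already created.

JOB_FLOW_OVERRIDES["Instances"] = {
    "Ec2SubnetId": "subnet-0123456789abcdef0",  # placeholder: replace with your own subnet ID
    "InstanceGroups": [
        {
            "Name": "Master node",
            "Market": "SPOT",
            "InstanceRole": "MASTER",
            "InstanceType": "m5.xlarge",
            "InstanceCount": 1,
        },
        {
            "Name": "Core - 2",
            "Market": "SPOT",
            "InstanceRole": "CORE",
            "InstanceType": "m5.xlarge",
            "InstanceCount": 2,
        },
    ],
    "KeepJobFlowAliveWhenNoSteps": True,
    "TerminationProtected": False,
}

Any other key accepted by the Instances parameter of boto3's run_job_flow can be passed through in the same way.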