Is it possible to submit and run Spark jobs concurrently on the same AWS EMR cluster? If yes, could you please elaborate?
You should use the flag --deploy-mode cluster, which lets you submit multiple applications to the cluster at the same time. YARN will then handle resource allocation and queueing for you.
The full example:
# --deploy-mode can be "client" for client mode
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
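Because cluster mode hands the application over to YARN, you can launch several submissions back to back and let the scheduler run them side by side as long as the cluster has free capacity. A minimal sketch, reusing the example jar from above (the executor sizes here are just illustrative, not a recommendation):

```shell
# Submit two applications in the background; each spark-submit returns
# once YARN accepts the application, so both can run concurrently.
for i in 1 2; do
  spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --executor-memory 4G \
    --num-executors 10 \
    /path/to/examples.jar 1000 &
done

wait  # block until both background submissions have returned
```

Whether the two applications actually run in parallel, rather than one queueing behind the other, depends on the YARN scheduler configuration and on the memory and vcores available on the cluster.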
More details in the official Spark documentation on submitting applications.
At the time of writing, EMR does not support running multiple steps in parallel; steps in a cluster are executed one after another. As far as I know, such a feature has been implemented experimentally but not yet released due to some issues.
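So if you go through the EMR Steps API, a second step will wait for the first to finish, which is why running spark-submit directly (as above) is the way to get concurrency. For completeness, a hedged sketch of what a step submission looks like with the AWS CLI (the cluster id and jar path are placeholders):

```shell
# Add a Spark step to an existing EMR cluster. Steps submitted this
# way are executed sequentially, not concurrently.
aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps 'Type=Spark,Name=SparkPi,ActionOnFailure=CONTINUE,Args=[--class,org.apache.spark.examples.SparkPi,--deploy-mode,cluster,/path/to/examples.jar,1000]'
```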