I'm currently running an Airflow (1.9.0) instance on Python 3.6.5. I have a manual workflow that I'd like to move to a DAG. This workflow requires code written in both Python 2 and Python 3. Let's simplify my DAG to 3 steps:
The Dataflow job is written in Python 2.7 (required by Google) and the TensorFlow model code is in Python 3. Looking at MLEngineTrainingOperator in Airflow 1.9.0, there is a python_version parameter which sets "the version of Python used in training".
Questions:
There isn't a way to specify the Python version dynamically on a worker. However, if you are using the Celery executor, you can run multiple workers, either on different servers/VMs or in different virtual environments.
You can have one worker running Python 3 and one running 2.7, each listening to a different queue. The queue a worker listens to can be set in three different ways:

- with the -q [queue-name] flag when starting the worker
- with the AIRFLOW__CELERY__DEFAULT_QUEUE environment variable
- with default_queue under [celery] in airflow.cfg

Then in your task definitions, specify a queue parameter, changing the queue depending on which Python version the task needs to run under (see the sketch below).
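To illustrate, here is a minimal sketch of that routing; the queue names, file paths, and DAG settings are all hypothetical:

```python
# Hypothetical sketch: route each task to a version-specific Celery queue.
# Assumes two workers were started beforehand, e.g.:
#   (python 2.7 env) airflow worker -q py2_queue
#   (python 3 env)   airflow worker -q py3_queue
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id='mixed_python_versions',
    start_date=datetime(2018, 1, 1),
    schedule_interval=None,
)

# Picked up only by the worker listening on py2_queue (Python 2.7)
dataflow_job = BashOperator(
    task_id='dataflow_job',
    bash_command='python /path/to/dataflow_job.py',
    queue='py2_queue',
    dag=dag,
)

# Picked up only by the worker listening on py3_queue (Python 3)
train_model = BashOperator(
    task_id='train_model',
    bash_command='python /path/to/train_model.py',
    queue='py3_queue',
    dag=dag,
)

dataflow_job >> train_model
```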
I'm not familiar with the MLEngineOperator, but you can specify a python_version in the PythonVirtualenvOperator (not available in 1.9.0; it was added in Airflow 1.10), which runs the callable in a virtualenv of that version. Alternatively, you can use the BashOperator: write the code to run in a separate file and invoke it with the absolute path to the Python interpreter you want to use.
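As a hedged sketch of that operator-based approach (again, assuming Airflow 1.10+; the callable body and requirements list are hypothetical):

```python
# Sketch assuming Airflow 1.10+: PythonVirtualenvOperator builds a fresh
# virtualenv with the requested interpreter for each task run.
from airflow.operators.python_operator import PythonVirtualenvOperator

def run_dataflow_job():
    # Executes inside the Python 2.7 virtualenv. The function must be
    # self-contained: do imports here, don't reference outer-scope names.
    import sys
    print(sys.version)

dataflow_py2 = PythonVirtualenvOperator(
    task_id='dataflow_py2',
    python_callable=run_dataflow_job,
    python_version='2.7',               # that interpreter must exist on the worker
    requirements=['apache-beam[gcp]'],  # hypothetical dependency list
    dag=dag,  # reuses the dag object from the earlier sketch
)
```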
Regardless of how the task is run, you just need to ensure the DAG file itself is compatible with the Python version of the worker parsing it, i.e. if you are going to start Airflow workers under different Python versions, the DAG file itself needs to be Python 2 & 3 compatible. The additional files the DAG calls out to (e.g. scripts run via the BashOperator), however, can be version-specific.
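In practice, keeping the DAG file parseable under both interpreters mostly means avoiding version-specific syntax at the top level, for example:

```python
# Minimal sketch: a DAG file that parses under both Python 2.7 and 3.x.
from __future__ import absolute_import, division, print_function

# e.g. use print(...) rather than the py2-only print statement,
# and avoid py3-only syntax like f-strings in the DAG file itself.
```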