I am confused by what airflow does if a dagrun fails. The behaviour I want to achieve is:
These are my dag arguments and task arguments:
task defaults:
'depends_on_past': True,
'start_date': airflow.utils.dates.days_ago(2),
'email': ['[email protected]'],
'email_on_failure': True,
'email_on_retry': False,
'retries': 1,
'retry_delay': timedelta(minutes=5),
'wait_for_downstream': False,
dag arguments:
schedule_interval=timedelta(minutes=60),
catchup=False,
max_active_runs=1
I think I am misunderstanding some of these arguments because it appears to me that if a task fails n times (i.e. the dagrun fails), then the next dagrun gets scheduled but just sits in the running state forever and no further dagruns ever succeed (or even get scheduled). for example, here are the dagruns (I didn't know where to find the text based scheduler logs like in this question) where the dags are scheduled to run every 5 minutes instead of every hour:

The execution runs every 5 minutes until the failure, after that the last execution is just in the running state and has been so for the past 30 minutes.
What have I done wrong?
I should add that restarting the scheduler doesn't help and neither does manually setting that running task to failed...
You have depends_on_past set to True, which is preventing the next DagRun from being started.
From the docs: depends_on_past (bool) – when set to true, task instances will run sequentially while relying on the previous task’s schedule to succeed. The task instance for the start_date is allowed to run.
This means that your Dag is trying to run, but it is waiting until the corresponding task from the previous DagRun has a success state.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With