Why do ds and ds_nodash macros return yesterday's date?

If the cron expression for my Airflow DAG is: 30 0 * * *, then why do my DAG runs show an execution date of the previous day?

I am using Airflow 1.10.10. In the DAG, I have PostgresOperators running SQL on a database. The SQL contains filters on a date column, and I'm filtering using the {{ ds_nodash }} macro. But the ds_nodash macro resolves to yesterday's date!

Here's the webserver view of the dag run dates:

[Screenshot showing the start date and execution date]

  • (I'm assuming that the date in the Run Id, scheduled__2021-02-21T00:30:00+00:00, is the DAG run's execution date, based on the behavior I describe above.)

My expectation is that the execution date should be the same as, or very close to, the start date, based on the cron interval expression. Is my assumption incorrect? If so, why?

cdabel asked Oct 27 '25 14:10



1 Answer

As you described, the run_id is created from the execution_date. Your SQL query probably needs to be:

WHERE date_col BETWEEN {{ ds_nodash }} AND {{ next_ds_nodash }}

The reason is that in ETL jobs you specify the window of data you want to query, but that window is only complete at the end of the interval. Airflow therefore triggers a run once its interval has ended, so the run with execution date 2021-02-21 can actually be executed only on 2021-02-22.
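To make the interval logic above concrete, here is a minimal sketch in plain Python. It does not use Airflow itself: the helper describe_run and the fixed one-day interval are assumptions made for illustration, standing in for the cron schedule 30 0 * * * from the question.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed one-day interval approximates the cron schedule "30 0 * * *".
SCHEDULE_INTERVAL = timedelta(days=1)


def describe_run(execution_date):
    """Hypothetical helper: values a daily run with this execution_date would see."""
    # The run covers [execution_date, interval_end) and can only start
    # once that interval has ended.
    interval_end = execution_date + SCHEDULE_INTERVAL
    return {
        "ds_nodash": execution_date.strftime("%Y%m%d"),
        "next_ds_nodash": interval_end.strftime("%Y%m%d"),
        "earliest_start": interval_end,
    }


run = describe_run(datetime(2021, 2, 21, 0, 30, tzinfo=timezone.utc))
print(run["ds_nodash"])                    # 20210221
print(run["next_ds_nodash"])               # 20210222
print(run["earliest_start"].isoformat())   # 2021-02-22T00:30:00+00:00
```

So a run whose Run Id carries 2021-02-21T00:30:00+00:00 starts on 2021-02-22, and its ds_nodash value is the previous calendar day relative to the start date.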

Possibly this answer may provide more information about the scheduling.

Since this is quite confusing for many users, there is a discussion on the dev mailing list about addressing it, so this may change in future Airflow versions.

Elad Kalif answered Oct 30 '25 15:10
