Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Template_searchpath gives TemplateNotFound error in Airflow and cannot find the SQL script

I have a DAG described like this :

tmpl_search_path = '/home/airflow/gcs/sql_requests/'

with DAG(dag_id='pipeline', default_args=default_args, template_searchpath = [tmpl_search_path]) as dag:

    create_table = bigquery_operator.BigQueryOperator(
        task_id = 'create_table',
        sql = 'create_table.sql',
        use_legacy_sql = False,
        destination_dataset_table = some_table)
    )

The task create_table calls a SQL script create_table.sql. This SQL script is not in the same folder as the DAG folder : it is in a sql_requests folder at the same level as the DAG folder. This is the architecture inside the bucket of the GCP Composer (which is the Google Airflow) is :

bucket_name
|- airflow.cfg
|- dags
   |_ pipeline.py
|- ...
|_ sql_requests
   |_ create_table.sql

What path do I need to set for template_searchpath to reference the folder sql_requests inside the Airflow bucket on GCP ?

I have tried template_searchpath= ['/home/airflow/gcs/sql_requests'], template_searchpath= ['../sql_requests'], template_searchpath= ['/sql_requests'] but none of these have worked.

The error message I get is 'jinja2.exceptions.TemplateNotFound'

like image 635
Valentin Richer Avatar asked Oct 27 '25 00:10

Valentin Richer


2 Answers

According to https://cloud.google.com/composer/docs/concepts/cloud-storage it is not possible to store files that are needed to execute dags elsewhere than in the folders dags or plugins :

To avoid a workflow failure, store your DAGs, plugins, and Python modules in the dags/ or plugins/ folders—even if your Python modules do not contain DAGs or plugins.

This is the reason why I had the TemplateNotFound error.

like image 192
Valentin Richer Avatar answered Oct 29 '25 17:10

Valentin Richer


You can store in mounted/known paths which are dags/plugins OR data

data folder has no capacity limits but it's easy to throw yourself off using it to store anything that the web server would need to read, because the web server can't access that folder (e.g if you put SQL files in /data folder, you would not be able to parse rendered template in the UI, but any tasks that need to access the file during run time would just run fine)

like image 27
the pillow Avatar answered Oct 29 '25 15:10

the pillow



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!