I'm trying to get Airflow working to better orchestrate an ETL process. When I make changes to a DAG in my dags folder, I often have to restart the scheduler with
airflow scheduler
before the changes are visible in the UI. I would like to run the scheduler as a daemon process with
airflow scheduler -D
but when I try to do so, I get a message saying
[2018-10-17 14:13:54,769] {jobs.py:580} ERROR - 
Cannot use more than 1 thread when using sqlite. Setting max_threads to 1
I think this error pops up because the scheduler is already running as a daemon. However, when I try to find out where the scheduler is being run with
lsof -i
I don't get any results.
Question: Why am I not able to restart the scheduler with airflow scheduler -D? Why does the scheduler restart with airflow webserver? How do I kill the process that is preventing me from running airflow scheduler -D?
CLI check for the scheduler: the scheduler creates an entry in the BaseJob table with information about the host and a timestamp (heartbeat) at startup, and then updates it regularly. You can use this to check whether the scheduler is working correctly by running the airflow jobs check command. On failure, the command exits with a non-zero error code.
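As a rough sketch of that health check, assuming an Airflow version that ships the jobs check subcommand (2.1 or later, so not the 1.x release in the question), you could wrap it in a small function and branch on the exit code. The function name scheduler_is_healthy is made up for illustration.

```shell
# Hypothetical helper: succeeds only if a scheduler job on this host
# has a recent heartbeat. Assumes Airflow 2.1+ (`airflow jobs check`).
scheduler_is_healthy() {
    airflow jobs check --job-type SchedulerJob --hostname "$(hostname)" >/dev/null 2>&1
}

# Example use (commented out so the snippet has no side effects):
# if scheduler_is_healthy; then
#     echo "scheduler is alive"
# else
#     echo "no live scheduler found on this host"
# fi
```

A non-zero exit status here only says that no healthy scheduler job was found for this host; it does not say why.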
You need to clear out the airflow-scheduler.pid file at $AIRFLOW_HOME. The stale pid file from the daemon will prevent you from starting another scheduler process.
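A minimal sketch of that cleanup, assuming the pid file contains just the daemon's PID and that AIRFLOW_HOME falls back to ~/airflow when unset. The helper name remove_stale_pid is hypothetical. It only deletes the file if the recorded process is no longer alive, so it will not pull the pid file out from under a running scheduler.

```shell
# Hypothetical helper: delete a pid file only when the process it names is gone.
# Usage: remove_stale_pid /path/to/airflow-scheduler.pid
remove_stale_pid() {
    pid_file="$1"
    [ -f "$pid_file" ] || return 0      # no pid file, nothing to clean up
    pid="$(cat "$pid_file")"
    # kill -0 sends no signal; it only tests whether we could signal the PID.
    # Caveat: it also fails for a live process owned by another user.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; stop it first" >&2
        return 1
    fi
    rm -f "$pid_file"
}

# Example use (commented out so the snippet has no side effects):
# remove_stale_pid "${AIRFLOW_HOME:-$HOME/airflow}/airflow-scheduler.pid" \
#     && airflow scheduler -D
```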
If you run Airflow locally and start it with the two commands airflow scheduler and airflow webserver, then those processes will run in the foreground. So, simply hitting Ctrl-C for each of them should terminate them and all their child processes.
Create an init script and use the daemon command to run this as a service.

You can use a ready-made AMI (namely, LightningFLow) from AWS Marketplace, which provides Airflow services (webserver, scheduler, worker) that are enabled at startup.
Run ps aux | grep airflow and check whether any airflow webserver or airflow scheduler processes are running. If they are, kill them and rerun the scheduler with airflow scheduler -D.
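One way to script that check, as a sketch: filter the ps aux output with awk and print only the PID column. The function name airflow_pids is made up, and the pattern assumes the processes show up with "airflow scheduler" or "airflow webserver" in their command line, which may not hold for every install.

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the PID
# (field 2) of every line that looks like an Airflow scheduler or webserver.
# awk does not match its own entry, because its command line contains
# "airflow (scheduler|webserver)" rather than the literal text being searched for.
airflow_pids() {
    awk '/airflow (scheduler|webserver)/ { print $2 }'
}

# Example use (commented out so the snippet has no side effects):
# ps aux | airflow_pids                                      # just list the PIDs
# for pid in $(ps aux | airflow_pids); do kill "$pid"; done   # send SIGTERM
```

Plain kill sends SIGTERM, which gives the processes a chance to shut down cleanly; reach for kill -9 only if they refuse to exit.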