
How to debug gunicorn [6383] [CRITICAL] WORKER TIMEOUT?

In my busy Django 1.8 site, I get loads of 502 errors due to gunicorn worker timeout:

[2019-06-11 04:56:29 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:6550)
[2019-06-11 04:56:31 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:6439)
[2019-06-11 04:56:31 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:7210)
[2019-06-11 04:56:33 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:6429)
[2019-06-11 04:56:46 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:6562)
[2019-06-11 04:59:41 +0000] [6383] [CRITICAL] WORKER TIMEOUT (pid:6560)
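
To see whether the timeouts cluster around particular moments, the log lines above can be bucketed per minute. This is a minimal sketch that assumes the timestamp format shown in these log lines; the function name `timeouts_per_minute` is made up for illustration.

```shell
# Count "WORKER TIMEOUT" lines per minute from gunicorn error-log text on stdin.
timeouts_per_minute() {
  # Each line starts with "[YYYY-MM-DD HH:MM:SS +0000]".
  # Field 1 is "[YYYY-MM-DD" (drop the bracket), field 2 is "HH:MM:SS" (keep HH:MM).
  grep -F 'WORKER TIMEOUT' \
    | awk '{ print substr($1, 2) " " substr($2, 1, 5) }' \
    | sort \
    | uniq -c
}
```

It could be run as `timeouts_per_minute < /var/log/gunicorn/error.log`, and the busiest minutes then compared against whatever else was happening on the server at that time.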

gunicorn.version 19.9.0

Here is my gunicorn.sh configuration:

#!/bin/bash

NAME="myapp"                                  
SOCKFILE=/tmp/gunicorn.sock   
USER=myuser                                       
GROUP=www-data                                   
NUM_WORKERS=48                                    
DJANGO_SETTINGS_MODULE=myapp.settings             
DJANGO_WSGI_MODULE=myapp.wsgi                     
MAX_REQ=20000
REQ_TIMEOUT=10
LOG_FILE=/var/log/gunicorn/error.log

echo "Starting $NAME as `whoami`"


cd $DJANGODIR
source /home/myuser/.myappenv/bin/activate
export DJANGO_SETTINGS_MODULE=$DJANGO_SETTINGS_MODULE
export PYTHONPATH=$DJANGODIR:$PYTHONPATH

# Create the run directory if it doesn't exist
RUNDIR=$(dirname $SOCKFILE)
test -d $RUNDIR || mkdir -p $RUNDIR


exec /home/myuser/.myappenv/bin/gunicorn ${DJANGO_WSGI_MODULE}:application \
  --name $NAME \
  --workers $NUM_WORKERS \
  --user=$USER --group=$GROUP \
  --bind=unix:$SOCKFILE \
  --log-level=error \
  --log-file $LOG_FILE \
  --max-requests=$MAX_REQ \
  --timeout=$REQ_TIMEOUT \
  --worker-class="egg:meinheld#gunicorn_worker" \
  --threads=2000
# Alternative worker class also tried: --worker-class=eventlet

The server has 128GB of RAM and a 24 core CPU.

The errors usually occur when the system load is above 20.

I have tweaked many parameters, including NUM_WORKERS, REQ_TIMEOUT, worker-class and threads, but none of them seems to have much effect. I've run out of ideas and would appreciate any hints.

Milkyway asked Oct 26 '25 18:10


2 Answers

For the record, my problem was not with gunicorn but with redis, which is used heavily to cache data.

The cache had grown to several hundred MB and appendfsync everysec was active, so each write to disk took more than one second and blocked the gunicorn processes. After commenting that setting out and using the appendfsync no policy instead, the problem went away.
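
To check whether a Redis instance is in the same situation, the AOF counters reported by `redis-cli info persistence` can be inspected. This is a minimal sketch, not the exact steps I followed: the helper name `info_field` is made up, and the canned input at the end uses invented numbers so the snippet runs without a server.

```shell
# Read `redis-cli info` style output on stdin and print the value of one field.
# INFO lines look like "name:value" and end in CRLF, so the \r is stripped first.
info_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Example with canned input (values are illustrative only); prints 42.
printf 'aof_enabled:1\r\naof_delayed_fsync:42\r\n' | info_field aof_delayed_fsync
```

Against a live server, assuming redis-cli can reach it with its default settings, this would be `redis-cli info persistence | info_field aof_delayed_fsync`; a counter that keeps growing suggests fsync is falling behind. The current policy can be read with `redis-cli config get appendfsync` and changed at runtime with `redis-cli config set appendfsync no`. With `no`, flushing is left to the operating system, so more recent writes can be lost on a crash.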

Milkyway answered Oct 29 '25 09:10


You may want to check that your app can connect to its database, if applicable. In my case, I was running a Django REST API in the cloud and had to change the security group on the database server to allow connections; nothing was actually wrong with the Django + Gunicorn deployment.
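
A quick way to rule this out from the app server is to test whether a TCP connection to the database opens at all. This is a minimal sketch that assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command; the function name `can_connect` and the host and port in the usage note are hypothetical.

```shell
# Return 0 if a TCP connection to host:port opens within the timeout,
# non-zero if it is refused, filtered, or times out.
can_connect() {
  host=$1
  port=$2
  secs=${3:-3}   # default timeout of 3 seconds
  # Inside `bash -c`, $0 is the host and $1 is the port passed after the script.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `can_connect db.example.internal 5432 && echo reachable || echo blocked`. A blocked result points at networking (security group, firewall) rather than at Django or gunicorn.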

oz21m answered Oct 29 '25 08:10


