Info:
CMD gunicorn -b 0.0.0.0:5000 --access-logfile - "app:create_app()"

ECSTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Cpu: "256"
    Memory: "1024"
    RequiresCompatibilities:
      - FARGATE
    ContainerDefinitions:
      - Name: contained_above
    .
    .
    .
ECSService:
  Type: AWS::ECS::Service
  DependsOn: ListenerRule
  Properties:
    Cluster: !Sub "${EnvName}-ECScluster"
    DesiredCount: 1
    LaunchType: FARGATE
    DeploymentConfiguration:
      MaximumPercent: 200
      MinimumHealthyPercent: 50
    NetworkConfiguration:
      AwsvpcConfiguration:
        AssignPublicIp: ENABLED
        Subnets:
          - Fn::ImportValue: !Sub "${EnvName}-PUBLIC-SUBNET-1"
          - Fn::ImportValue: !Sub "${EnvName}-PUBLIC-SUBNET-2"
        SecurityGroups:
          - Fn::ImportValue: !Sub "${EnvName}-CONTAINER-SECURITY-GROUP"
    ServiceName: !Sub "${EnvName}-ECS-SERVICE"
    TaskDefinition: !Ref ECSTaskDefinition
    LoadBalancers:
      - ContainerName: contained_above
        ContainerPort: 5000
        TargetGroupArn: !Ref TargetGroup
(App is working normally)
Question
My question is: how many workers should I set on the gunicorn command (the last command in my Dockerfile)?
The gunicorn design docs state: "Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with."
So what is the number of cores on Fargate? Does it actually make sense to combine gunicorn with Fargate as in the setup above? Is there any 'compatibility' between load balancers and gunicorn workers? What is the connection between the DesiredCount of the ECS Service and the gunicorn -w workers value? Am I missing or misunderstanding something?
Possible solution(?)
One way I could invoke it is the following:
CMD gunicorn -b 0.0.0.0:5000 -w $(( 2 * `cat /proc/cpuinfo | grep 'core id' | wc -l` + 1 )) --access-logfile - "app:create_app()"
But I am not sure if that would be a good solution. Any insights? Thanks
EDIT: I use a configuration file that gunicorn reads at startup:
gunicorn.conf.py
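import multiprocessing

# Listen on all interfaces on port 8080.
bind = "0.0.0.0:8080"
# Starting point from the gunicorn docs: (2 x cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1
# uvicorn's ASGI worker that uses the h11 HTTP implementation.
worker_class = "uvicorn.workers.UvicornH11Worker"
# Seconds to wait for further requests on a keep-alive connection.
keepalive = 0
You can tell gunicorn which config file to use with the --config flag.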
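Since a gunicorn config file is plain Python, the worker count does not have to be hard-wired to the CPU count. Below is a minimal sketch of a variant that lets the task definition override the value and keeps the computed default inside a fixed range. The environment variable name GUNICORN_WORKERS, the helper pick_workers and the bounds 2 and 12 are assumptions chosen for illustration, not gunicorn settings.

```python
import multiprocessing
import os


def pick_workers(env=os.environ, cpu_count=None, lower=2, upper=12):
    """Return an explicit override if set, else (2 x cores) + 1 clamped to [lower, upper]."""
    # GUNICORN_WORKERS is a hypothetical variable name you would set yourself,
    # for example in the container definition's Environment section.
    override = env.get("GUNICORN_WORKERS")
    if override:
        return max(1, int(override))
    cores = cpu_count if cpu_count is not None else multiprocessing.cpu_count()
    return max(lower, min(upper, 2 * cores + 1))


bind = "0.0.0.0:8080"
workers = pick_workers()
```

With this in place, the worker count can be changed per environment without rebuilding the image, and a large reported core count cannot push a small task into spawning dozens of workers.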
Sadly I can't find the source anymore, but I've read that 4-12 workers should be enough to handle hundreds if not thousands of simultaneous requests, depending on your application structure, worker class and payload size. Do take this with a grain of salt, since I can't find the source anymore, but if I remember correctly it was in an accepted SO answer from a well-regarded user.
The official gunicorn docs state something in the 2-4 x $(NUM_CORES) range. Another option, as the gunicorn docs state elsewhere:
Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.
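The TTIN and TTOU signals mentioned in the quote are sent to the gunicorn master process: TTIN adds one worker, TTOU removes one. A rough sketch of doing that from Python follows. The function name resize_workers and its injectable kill parameter are made up for this example, and it assumes you already know the master's PID (for instance from a pidfile) and can reach the process.

```python
import os
import signal


def resize_workers(master_pid, delta, kill=os.kill):
    """Send TTIN (add a worker) or TTOU (remove a worker) abs(delta) times."""
    sig = signal.SIGTTIN if delta > 0 else signal.SIGTTOU
    for _ in range(abs(delta)):
        kill(master_pid, sig)  # one signal changes the worker count by one
    return sig, abs(delta)
```

Calling resize_workers(pid, 2) while the application is under load asks the master for two more workers; resize_workers(pid, -1) takes one away again.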
So far I've been doing well sticking to the 4-12 worker recommendation. My company runs several APIs that call other external APIs, which results in request times of mostly 1-2 seconds, with the longest taking up to a whole minute (a lot of external API calls here).
A colleague I talked to mentioned that they use 1 worker per 5 simultaneous requests they expect, with APIs similar to ours. That works fine for them as well.
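To make the "1 worker per 5 simultaneous requests" rule of thumb concrete, and to relate it to the DesiredCount from the question, here is a small back-of-the-envelope sketch. The function workers_per_task is hypothetical, and it assumes the load balancer spreads requests roughly evenly across the running tasks.

```python
import math


def workers_per_task(expected_concurrent_requests, desired_count=1, requests_per_worker=5):
    """Estimate gunicorn workers per ECS task from the expected concurrency."""
    # Total workers needed across the whole service, by the 1-per-5 rule of thumb.
    total = math.ceil(max(expected_concurrent_requests, 0) / requests_per_worker)
    # Split across tasks, assuming an even spread; always keep at least one worker.
    return max(1, math.ceil(total / desired_count))
```

For example, 40 expected simultaneous requests would suggest 8 workers with DesiredCount: 1, or 4 workers per task with DesiredCount: 2. Treat the result as a starting guess to tune under load, not a guarantee.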