
SLURM automatically limit memory/cpu usage depending on GRES

Tags: slurm

Given that a single node has multiple GPUs, is there a way to automatically limit CPU and memory usage depending on the number of GPUs requested?

In particular, if the user's job script requests 2 GPUs, then the job should automatically be restricted to 2*BaseMEM and 2*BaseCPU, where BaseMEM = TotalMEM/numGPUs and BaseCPU = numCPUs/numGPUs are defined on a per-node basis.

Is it possible to configure SLURM this way? If not, can one alternatively "virtually" split a multi-GPU machine into multiple nodes with the appropriate CPU and MEM count?

Asked by Hyperplane on Sep 13 '25 at 00:09


1 Answer

On the command line (sbatch, srun, or salloc), set the per-GPU amounts explicitly:

--cpus-per-gpu $BaseCPU --mem-per-gpu $BaseMEM

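In slurm.conf, set per-GPU defaults (DefMemPerGPU is in megabytes). These are defaults, so they apply only when a job does not request its own values: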

DefMemPerGPU=1234
DefCpuPerGPU=1

slurm.conf does not support variables, so you need to compute $BaseCPU and $BaseMEM yourself (for example with a short bash command) and write the resulting numbers into the file.
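
A minimal sketch of that calculation, assuming integer division is acceptable and that memory is given in megabytes (the unit DefMemPerGPU expects). The function name per_gpu_defaults and the example node size (32 CPUs, 256000 MB, 4 GPUs) are made up for illustration; substitute your node's real values.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Slurm): print per-GPU defaults for one node.
# Arguments: $1 = total CPUs, $2 = total memory in MB, $3 = number of GPUs.
per_gpu_defaults() {
    # Refuse a zero or negative GPU count to avoid dividing by zero.
    if [ "$3" -le 0 ]; then
        echo "number of GPUs must be positive" >&2
        return 1
    fi
    # Integer division rounds down, so the per-GPU shares never add up
    # to more than the node actually has.
    echo "DefCpuPerGPU=$(( $1 / $3 ))"
    echo "DefMemPerGPU=$(( $2 / $3 ))"
}

# Example with assumed numbers: 32 CPUs, 256000 MB of memory, 4 GPUs.
per_gpu_defaults 32 256000 4
```

For the example node this prints DefCpuPerGPU=8 and DefMemPerGPU=64000, which can be pasted into slurm.conf. The same two numbers can be passed as $BaseCPU and $BaseMEM to --cpus-per-gpu and --mem-per-gpu.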

Answered by donaldsa18 on Sep 16 '25 at 03:09