Given that a single node has multiple GPUs, is there a way to automatically limit CPU and memory usage depending on the number of GPUs requested?
In particular, if the user's job script requests 2 GPUs, then the job should automatically be restricted to 2*BaseMEM and 2*BaseCPU, where BaseMEM = TotalMEM/numGPUs and BaseCPU = numCPUs/numGPUs, both defined on a per-node basis.
Is it possible to configure SLURM this way? If not, can one alternatively "virtually" split a multi-GPU machine into multiple nodes with the appropriate CPU and MEM count?
On the command line:
--cpus-per-gpu $BaseCPU --mem-per-gpu $BaseMEM
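For example, on a hypothetical node with 4 GPUs, 64 CPUs, and 512 GB of RAM, BaseCPU = 64/4 = 16 and BaseMEM = 512/4 = 128 GB, so a 2-GPU job script could look like the following (node specs and program name are made up for illustration):

#!/bin/bash
#SBATCH --gpus=2                 # request 2 GPUs
#SBATCH --cpus-per-gpu=16        # BaseCPU, so 2*16 = 32 CPUs in total
#SBATCH --mem-per-gpu=128G       # BaseMEM, so 2*128 GB = 256 GB in total

srun ./my_gpu_program            # placeholder for the actual workload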
In slurm.conf:
DefMemPerGPU=1234
DefCpuPerGPU=1
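These defaults can also be set per partition. A sketch, assuming a partition named gpu backed by nodes with the layout above (node names are placeholders; DefMemPerGPU is in megabytes):

PartitionName=gpu Nodes=gpunode[01-04] DefCpuPerGPU=16 DefMemPerGPU=128000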
Since you can't use variables in slurm.conf, you would need to write a little bash command to calculate $BaseCPU and $BaseMEM, along the lines of the sketch below.
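A minimal sketch of such a wrapper, assuming the node name is known and the GPU count can be read from the node's Gres string (the exact Gres format varies between sites, so the parsing may need adjusting):

#!/bin/bash
# Derive per-GPU CPU and memory shares for one node, then submit a job.
NODE=gpunode01        # placeholder node name
REQ_GPUS=2            # GPUs the job will request

CPUS=$(scontrol show node "$NODE" | grep -oP 'CPUTot=\K[0-9]+')
MEM=$(scontrol show node "$NODE" | grep -oP 'RealMemory=\K[0-9]+')   # in MB
NGPUS=$(scontrol show node "$NODE" | grep -oP 'Gres=gpu:\K[0-9]+')

BaseCPU=$(( CPUS / NGPUS ))
BaseMEM=$(( MEM / NGPUS ))

sbatch --gpus=$REQ_GPUS \
       --cpus-per-gpu=$BaseCPU \
       --mem-per-gpu=${BaseMEM}M \
       job.sh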