I have an R script that uses parallel computing via the parallel and future packages. The code that sets up the parallel configuration in the R script is:
cl <- parallel::makeCluster(100)
future::plan(future::cluster, workers = cl)
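For reference, I know I could also size the cluster from what the scheduler grants instead of hard-coding 100; a sketch of that, assuming the job exports SLURM_CPUS_PER_TASK, would be:
# read the per-task CPU count granted by Slurm; fall back to 1 if unset
n_workers <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))
cl <- parallel::makeCluster(n_workers)
future::plan(future::cluster, workers = cl)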
I am running the R script on an HPC cluster where each node has 20 CPUs. What is the SLURM configuration to run the R script as one job across multiple nodes? Will:
--cpus-per-task=100
be sufficient?
Thank you
By default, if you request N nodes and launch M tasks, Slurm will distribute the M tasks across the N nodes. So, if you want to launch 100 tasks across 2 nodes, you just need to specify --nodes 2 and --ntasks 100. The 100 tasks (100 copies of your R script) will be spread across the 2 nodes.
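As a sketch, a batch script for that first case could look like the following (my_script.R is a placeholder for your actual script name):
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=100
#SBATCH --cpus-per-task=1
# srun starts one copy of the R script per task, spread over the 2 nodes
srun Rscript my_script.R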
But if you only want to launch your R script twice (once per node, so each copy uses shared memory within its node and gets 20 CPUs for that single task), then you can use --nodes 2 --ntasks 2 --cpus-per-task 20.
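For that second case, a sketch of the batch script (again with my_script.R as a placeholder) could be:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=20
# one task per node; each task gets all 20 CPUs of its node
srun Rscript my_script.R
Inside the script, makeCluster() would then need to request 20 workers (or read SLURM_CPUS_PER_TASK) rather than 100, since each copy of the script only sees the CPUs of its own node.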
Reading this post, I realized that I cannot run my job as a single cluster across multiple nodes, since the nodes do not share memory.