Suppose I use 2 GPUs in a DDP setting.
If I would use a batch size of 16 when running the experiment on a single GPU,
should I set the batch size to 8 or to 16 when using 2 GPUs with DDP?
Is 16 divided into 8 and 8 automatically?
Thank you!
No, it won't be split automatically.
When you set batch_size=8 under DDP mode, each GPU receives batches of size 8 from its dataloader, so the global batch size is 16.
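To make the above concrete, here is a minimal pure-Python sketch (no torch required) of how a DistributedSampler-style partition works under DDP: each rank sees a disjoint shard of the dataset, and each rank's DataLoader draws batch_size samples from its own shard, so the effective global batch is world_size * batch_size. The variable names are illustrative, not the torch API.

```python
# Toy dataset and DDP-like configuration (illustrative values)
dataset = list(range(32))   # 32 samples
world_size = 2              # two GPUs / two DDP processes
batch_size = 8              # batch_size passed to each process's DataLoader

# DistributedSampler-style striding: rank r sees indices r, r+world_size, ...
shards = [dataset[rank::world_size] for rank in range(world_size)]

# The first batch on each rank has batch_size samples:
first_batches = [shard[:batch_size] for shard in shards]
global_batch = sum(len(b) for b in first_batches)
print(global_batch)  # 16 -> effective global batch = world_size * batch_size
```

Note that the shards are disjoint, which is why DDP's per-process batches add up rather than overlap.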
I don't agree with Deusy94's answer.
If I understand correctly according to pytorch's official example using distributed data parallel (ddp) at line 160:
args.batch_size = int(args.batch_size / ngpus_per_node)
the batch size when you instantiate the DataLoader is the batch size for a single process/single node.
Note that in the argparser the comment was:
parser.add_argument('-b', '--batch-size', default=256, type=int,
metavar='N',
help='mini-batch size (default: 256), this is the total '
'batch size of all GPUs on the current node when '
'using Data Parallel or Distributed Data Parallel')
Hence, let's say you passed --batch-size 16 here and you have two GPUs: args.batch_size will be updated to 8 (divided by the number of GPUs) at line 160 above, and the actual DataLoader you create has batch_size=8, which is the dataloader for an individual GPU.
Therefore, if you create the dataloader with DataLoader(dataset, batch_size=16) and you start DDP with 2 GPUs, each GPU will proceed with batch_size=16 and your global batch size will be 32.
This is different from DataParallel, which has a gather/scatter procedure: your batch is automatically scattered into equal-sized chunks, one per GPU (i.e., with DataLoader(dataset, batch_size=16), each GPU gets 8).
Either way, it's quite easy to verify: iterate the dataloader with a progress bar (e.g., tqdm) to log how many steps it takes to traverse all batches (i.e., the number of batches), and check which equation holds: batch_size * num_batches == dataset_size or num_gpu * batch_size * num_batches == dataset_size.
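For contrast, here is a small sketch (plain Python, assumed illustrative values) of the DataParallel-style scatter: the single batch produced by the DataLoader is split into equal chunks, one per GPU, so each GPU sees batch_size // num_gpus samples.

```python
# One batch of 16 samples, as produced by DataLoader(dataset, batch_size=16)
batch = list(range(16))
num_gpus = 2

# DataParallel-style scatter: split the batch into equal per-GPU chunks
chunk = len(batch) // num_gpus
per_gpu = [batch[i * chunk:(i + 1) * chunk] for i in range(num_gpus)]
print([len(c) for c in per_gpu])  # [8, 8]
```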
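The counting check above can be sketched without any GPUs. Under DDP, each process iterates only its shard of the dataset, so the number of steps a progress bar shows per process is ceil(dataset_size / num_gpus / batch_size); the dataset size and batch size below are illustrative assumptions.

```python
import math

dataset_size = 1000
batch_size = 16     # value passed to the DataLoader in each DDP process
num_gpus = 2

# Under DDP each process iterates only its shard of the dataset:
per_process_samples = math.ceil(dataset_size / num_gpus)
num_batches = math.ceil(per_process_samples / batch_size)

print(num_batches)  # 32 steps per process
# num_gpus * batch_size * num_batches == 1024, which (up to the last
# partial/padded batch) matches dataset_size -> the global batch is
# num_gpus * batch_size = 32, confirming the DDP case.
print(num_gpus * batch_size * num_batches)
```

If instead batch_size * num_batches alone matched dataset_size, you would know each process was iterating the full dataset with the full batch, as in the single-GPU or DataParallel case.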