How can I use multiple GPUs during model training on Kaggle?

On Kaggle I have two T4 GPUs, but I don't understand how to use them in PyTorch or how to adapt my code to train on both GPUs.

[screenshot of the Kaggle accelerator panel showing two T4 GPUs]

My training code:

# model, criterion, optimizer, and dataset are assumed to be defined earlier
from tqdm import tqdm

for epoch in range(2):
    running_loss = 0.0
    for data in tqdm(dataset):
        inputs, labels = data
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = model(inputs)            # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backward pass
        optimizer.step()                   # update parameters

        running_loss += loss.item()
asked Dec 06 '25 by Ilya Rudenko

1 Answer

To run on multiple GPUs you must adapt your code for distributed training. PyTorch has documentation covering this area here:

https://pytorch.org/tutorials/distributed/home.html

From there you can find the approach that fits your case best.
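
For concreteness, here is a minimal sketch of the DistributedDataParallel setup those tutorials describe, with one process per GPU. The tiny linear model, the rendezvous address, and the port are placeholder assumptions, not taken from the question:

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    def train(rank, world_size):
        # Each spawned process drives one GPU; rank identifies the process.
        os.environ["MASTER_ADDR"] = "localhost"   # placeholder rendezvous address
        os.environ["MASTER_PORT"] = "29500"       # placeholder port
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        model = nn.Linear(10, 2).to(rank)         # stand-in for the real model
        model = DDP(model, device_ids=[rank])

        # ... run the usual training loop here, feeding each process its own
        # shard of the data (e.g. via a DistributedSampler) ...

        dist.destroy_process_group()

    if __name__ == "__main__":
        world_size = torch.cuda.device_count()    # 2 on a Kaggle T4 x2 kernel
        mp.spawn(train, args=(world_size,), nprocs=world_size)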

I also found another source that covers distributed training: https://saturncloud.io/blog/how-to-use-multiple-gpus-in-pytorch/

It describes three different methods, of which the first, data parallelism, is the most common for simpler and smaller models, as it requires the fewest changes to existing code (see the sketch below).
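
As a rough illustration of that first approach, the sketch below wraps a model in nn.DataParallel; Net is a placeholder standing in for the model from the question, which isn't shown in full:

    import torch
    import torch.nn as nn

    class Net(nn.Module):                  # placeholder for the asker's model
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 2)

        def forward(self, x):
            return self.fc(x)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = Net()

    if torch.cuda.device_count() > 1:
        # DataParallel replicates the model on every visible GPU, splits each
        # batch along dim 0, and gathers the outputs on the default device.
        model = nn.DataParallel(model)

    model.to(device)

With that wrapper in place, the training loop from the question can stay essentially as it is; the only extra step is moving each batch to the device, e.g. inputs, labels = inputs.to(device), labels.to(device).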

answered Dec 09 '25 by Albin Sidås