
Pytorch CPU CUDA device load without gpu

I found this nice PyTorch MobileNet code, which I can't get running on a CPU: https://github.com/rdroste/unisal

I am new to PyTorch, so I am not sure what to do.

In line 174 of the module train.py the device is set:

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

which is correct, as far as I know about PyTorch.

Do I have to change the torch.load calls too? I tried, with no success.

class BaseModel(nn.Module):
    """Abstract model class with functionality to save and load weights"""

    def forward(self, *input):
        raise NotImplementedError

    def save_weights(self, directory, name):
        torch.save(self.state_dict(), directory / f'weights_{name}.pth')

    def load_weights(self, directory, name):
        self.load_state_dict(torch.load(directory / f'weights_{name}.pth'))

    def load_best_weights(self, directory):
        self.load_state_dict(torch.load(directory / f'weights_best.pth'))

    def load_epoch_checkpoint(self, directory, epoch):
        """Load state_dict from a Trainer checkpoint at a specific epoch"""
        chkpnt = torch.load(directory / f"chkpnt_epoch{epoch:04d}.pth")
        self.load_state_dict(chkpnt['model_state_dict'])

    def load_checkpoint(self, file):
        """Load state_dict from a specific Trainer checkpoint"""
        chkpnt = torch.load(file)
        self.load_state_dict(chkpnt['model_state_dict'])

    def load_last_chkpnt(self, directory):
        """Load state_dict from the last Trainer checkpoint"""
        last_chkpnt = sorted(list(directory.glob('chkpnt_epoch*.pth')))[-1]
        self.load_checkpoint(last_chkpnt)

I don’t get it. Where do I have to tell PyTorch that there is no GPU?

complete error:

Traceback (most recent call last):
  File "run.py", line 99, in <module>
    fire.Fire()

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)

  File "run.py", line 95, in predict_examples
    example_folder, is_video, train_id=train_id, source=source)

  File "run.py", line 72, in predictions_from_folder
    folder_path, is_video, source=source, model_domain=model_domain)

  File "/home/b256/Data/saliency_models/unisal-master/unisal/train.py", line 871, in generate_predictions_from_path
    self.model.load_best_weights(self.train_dir)

  File "/home/b256/Data/saliency_models/unisal-master/unisal/train.py", line 1057, in model
    self._model = model_cls(**self.model_cfg)

  File "/home/b256/Data/saliency_models/unisal-master/unisal/model.py", line 190, in __init__
    self.cnn = MobileNetV2(**self.cnn_cfg)

  File "/home/b256/Data/saliency_models/unisal-master/unisal/models/MobileNetV2.py", line 156, in __init__
    Path(__file__).resolve().parent / 'weights/mobilenet_v2.pth.tar')

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 367, in load
    return _load(f, map_location, pickle_module)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 538, in _load
    result = unpickler.load()

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 504, in persistent_load
    data_type(size), location)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 113, in default_restore_location
    result = fn(storage, location)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 94, in _cuda_deserialize
    device = validate_cuda_device(location)

  File "/home/b256/anaconda3/envs/unisal36/lib/python3.6/site-packages/torch/serialization.py", line 78, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
OK707 asked May 05 '26 08:05


1 Answer

The PyTorch tutorial at https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-on-gpu-load-on-cpu shows that torch.load takes a map_location keyword argument, which sends the weights to the proper device:

model.load_state_dict(torch.load(PATH, map_location=device))
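Note that in the traceback from the question, the failing torch.load call is the one in MobileNetV2.py (line 156), so that call would presumably need map_location as well as the ones in BaseModel. As a minimal sketch of the pattern, assuming a small nn.Linear as a hypothetical stand-in for the real model and a temporary file in place of the real weights:

```python
import tempfile
from pathlib import Path

import torch
import torch.nn as nn

# Same device selection as in the question's train.py.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in for the real model; not part of unisal.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'weights_best.pth'
    torch.save(model.state_dict(), path)
    # map_location remaps every stored tensor to `device` while loading,
    # so a checkpoint saved on a GPU can still be read on a CPU-only machine.
    state_dict = torch.load(path, map_location=device)

restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)
restored.to(device)

loaded_devices = {t.device.type for t in restored.state_dict().values()}
```

Passing the device variable rather than a hard-coded 'cpu' keeps the same line working on machines that do have a GPU.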

From the docs (https://pytorch.org/docs/stable/generated/torch.load.html#torch.load):

torch.load() uses Python’s unpickling facilities but treats storages, which underlie tensors, specially. They are first deserialized on the CPU and are then moved to the device they were saved from. If this fails (e.g. because the run time system doesn’t have certain devices), an exception is raised. However, storages can be dynamically remapped to an alternative set of devices using the map_location argument.

If map_location is a callable, it will be called once for each serialized storage with two arguments: storage and location. The storage argument will be the initial deserialization of the storage, residing on the CPU. Each serialized storage has a location tag associated with it which identifies the device it was saved from, and this tag is the second argument passed to map_location. The builtin location tags are 'cpu' for CPU tensors and 'cuda:device_id' (e.g. 'cuda:2') for CUDA tensors. map_location should return either None or a storage. If map_location returns a storage, it will be used as the final deserialized object, already moved to the right device. Otherwise, torch.load() will fall back to the default behavior, as if map_location wasn’t specified.
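The callable form can be sketched as follows. The function name keep_on_cpu and the seen_tags list are illustrative only, and the checkpoint is a small in-memory one created just for the example. Returning the storage unchanged leaves it on the CPU, whatever device it was saved from:

```python
import io

import torch
import torch.nn as nn

seen_tags = []  # location tags passed to the callable, recorded for inspection


def keep_on_cpu(storage, location):
    """Hypothetical map_location callable: keep every storage on the CPU."""
    seen_tags.append(location)
    # `storage` already lives on the CPU at this point; returning it as-is
    # tells torch.load to use it without moving it anywhere else.
    return storage


# Build a small in-memory checkpoint to load back.
buffer = io.BytesIO()
torch.save(nn.Linear(4, 2).state_dict(), buffer)
buffer.seek(0)

state_dict = torch.load(buffer, map_location=keep_on_cpu)
callable_devices = {t.device.type for t in state_dict.values()}
```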

If map_location is a torch.device object or a string containing a device tag, it indicates the location where all tensors should be loaded.

Otherwise, if map_location is a dict, it will be used to remap location tags appearing in the file (keys), to ones that specify where to put the storages (values).
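The device, string and dict forms can be compared side by side. This is only a sketch: saved_checkpoint is a hypothetical helper that builds a small in-memory checkpoint, and the 'cuda:0' key is an assumption about which device the real weights were saved from.

```python
import io

import torch
import torch.nn as nn


def saved_checkpoint():
    """Hypothetical helper: return a fresh in-memory checkpoint to load."""
    buffer = io.BytesIO()
    torch.save(nn.Linear(4, 2).state_dict(), buffer)
    buffer.seek(0)
    return buffer


# A device tag string: every tensor is loaded onto that device.
from_string = torch.load(saved_checkpoint(), map_location='cpu')

# A torch.device object: same effect as the string form.
from_device = torch.load(saved_checkpoint(), map_location=torch.device('cpu'))

# A dict: only storages tagged with a listed key are remapped
# (here anything saved from 'cuda:0' would go to the CPU);
# storages with other tags keep their default behavior.
from_dict = torch.load(saved_checkpoint(), map_location={'cuda:0': 'cpu'})

all_devices = {
    t.device.type
    for sd in (from_string, from_device, from_dict)
    for t in sd.values()
}
```

For a CPU-only machine the string form 'cpu' is the simplest choice; the dict form is mainly useful when only some devices need remapping.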

ted answered May 06 '26 23:05


