
Runtime Error while Saving a PyTorch Model: "File /path/to/be/saved Cannot Be Opened"

Tags: python, pytorch

I am training a CNN on CIFAR-10 with PyTorch and following the official PyTorch tutorial on saving a general checkpoint.

When training and testing are complete, I pass the last epoch to this save_model function:

import torch

def save_model(epoch):
    # `net` and `optimizer` are defined outside this function.
    torch.save({
        'epoch': epoch + 1,
        'model_state_dict': net.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
    }, '/home/cc/research/AdderNet/pretrained/minionn.pt')

However, I keep getting the following error while trying to save the model:

> Train - Epoch 1, Batch: 1, Loss: 2.302385
> Test Avg. Loss: 0.020081, Accuracy: 0.269100
> Train - Epoch 2, Batch: 1, Loss: 2.019350
> Test Avg. Loss: 0.018918, Accuracy: 0.324800
> Traceback (most recent call last):
> File "/home/cc/research/AdderNet/main.py", line 119, in <module>
> main()
> File "/home/cc/research/AdderNet/main.py", line 115, in main
> save_model(epoch)
> File "/home/cc/research/AdderNet/main.py", line 105, in save_model
> torch.save({
> File "/home/cc/anaconda3/envs/torch/lib/python3.10/site-packages/torch/serialization.py", line 422, in save
> with _open_zipfile_writer(f) as opened_zipfile:
> File "/home/cc/anaconda3/envs/torch/lib/python3.10/site-packages/torch/serialization.py", line 309, in _open_zipfile_writer
> return container(name_or_buffer)
> File "/home/cc/anaconda3/envs/torch/lib/python3.10/site-packages/torch/serialization.py", line 287, in __init__
> super(_open_zipfile_writer_file, self).__init__(torch._C.PyTorchFileWriter(str(name)))
> **RuntimeError: File /home/cc/research/AdderNet/pretrained/minionn.pt cannot be opened.**

What do you think the problem is? Please let me know if I should add any other details. I am running my code on a remote server through VS Code, inside a virtual environment that I created with conda. The Python version in that environment is 3.10.8, conda's base Python is 3.9.13, and the system default Python (with conda deactivated) is 3.8.10. The operating system is Ubuntu 20.04.

Update:

I am able to save the model using the following:

torch.save(model, '/home/cc/research/AdderNet/pretrained/FILE_NAME')

But since I want to load the saved model and continue training, the PyTorch tutorial recommends the following approach, which does not work for me:

torch.save({
    'epoch': EPOCH,
    'model_state_dict': net.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': LOSS,
}, '/home/cc/research/AdderNet/pretrained/FILE_NAME.pt')
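
The error message only says that the file cannot be opened, so it may help to inspect the target path before calling torch.save. The following is a minimal sketch using only the standard library. The helper name diagnose_save_path is hypothetical, and the conditions it checks (missing parent directory, missing write permission, target path that is already a directory) are common causes of failed writes in general, not a confirmed diagnosis of this case.

```python
import os


def diagnose_save_path(path):
    """Return a list of problems that could stop a file being written at `path`.

    Hypothetical helper for illustration; an empty list means none of the
    checked conditions apply, not that the save is guaranteed to succeed.
    """
    problems = []
    parent = os.path.dirname(os.path.abspath(path))

    # The parent directory must exist and allow creating entries in it.
    if not os.path.isdir(parent):
        problems.append(f"parent directory does not exist: {parent}")
    elif not os.access(parent, os.W_OK | os.X_OK):
        problems.append(f"parent directory is not writable: {parent}")

    # The target itself must not be a directory or a read-only file.
    if os.path.isdir(path):
        problems.append(f"target is an existing directory: {path}")
    elif os.path.exists(path) and not os.access(path, os.W_OK):
        problems.append(f"target exists but is not writable: {path}")

    return problems


if __name__ == "__main__":
    # Path taken from the question; run this on the machine that raises the error.
    target = "/home/cc/research/AdderNet/pretrained/minionn.pt"
    for problem in diagnose_save_path(target):
        print(problem)
```

If the list comes back empty, the cause lies elsewhere, for example in how the remote file system is mounted, and this sketch cannot detect that.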
Al A asked Mar 18 '26 02:03


1 Answer

I faced the same problem on a Windows machine while trying to save the model with a timestamp in the filename:

model_path = os.path.join(model_dir, str(datetime.datetime.now())+".pth")
torch.save(self.state_dict(), model_path)

This threw the same error. Once I removed the timestamp, it worked:

model_path = os.path.join(model_dir, "model"+".pth")
torch.save(self.state_dict(), model_path)

I do not know the reason, but it worked for me.
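
A likely explanation for the Windows case, not confirmed in this answer: str(datetime.datetime.now()) produces text such as "2026-03-20 14:03:05", and Windows does not allow the colon character in filenames. The sketch below keeps the timestamp but formats it without colons or spaces. The helper name timestamped_model_path and the "model_" prefix are assumptions made for illustration.

```python
import datetime
import os


def timestamped_model_path(model_dir, now=None, ext=".pth"):
    """Build a checkpoint path whose timestamp avoids ':' and spaces.

    Hypothetical helper; `now` can be passed in to make the result reproducible.
    """
    now = now or datetime.datetime.now()
    # Use '-' and '_' only, which are valid in filenames on Windows and Linux.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(model_dir, f"model_{stamp}{ext}")
```

The returned path can then be passed to torch.save in place of the one built from str(datetime.datetime.now()). This does not explain the original question, which was asked on Ubuntu, where colons are allowed in filenames.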

SOUVIK PAL answered Mar 20 '26 14:03


