<p>Is it possible to fix the seed for <code>torch.utils.data.random_split()</code> when splitting a dataset so that it is possible to reproduce the test results?</p>


Fixing the seed for torch random_split()

2 Answers

You can use the torch.manual_seed function to seed the script globally:

import torch
torch.manual_seed(0)

See the reproducibility documentation for more information.
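As a minimal sketch of how global seeding affects the split (the helper name split_indices is made up for illustration, and range(10) stands in for a real dataset): when no generator is passed, random_split draws from the global default generator, so re-seeding before each call should give the same indices.

```python
import torch
from torch.utils.data import random_split


def split_indices(seed):
    # Hypothetical helper: seed the global RNG, then split right away.
    # Without a generator argument, random_split uses the global default generator.
    torch.manual_seed(seed)
    train, test = random_split(range(10), [3, 7])
    return list(train.indices), list(test.indices)


first = split_indices(0)
second = split_indices(0)

# Same seed, same split.
print(first == second)
```

Note that this only holds if nothing else consumes random numbers between the seeding call and the split.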

If you want to seed torch.utils.data.random_split specifically, you could "reset" the seed to its initial value afterwards. Simply use torch.initial_seed() like this:

torch.manual_seed(torch.initial_seed())
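A small sketch of that reset, assuming the script was seeded with 0 at the start and that some unrelated random draws happen before the split (the torch.rand call is only there to advance the generator):

```python
import torch
from torch.utils.data import random_split

torch.manual_seed(0)
noise = torch.rand(5)  # unrelated draws advance the global generator

# Rewind the global generator to the state it had right after seeding.
torch.manual_seed(torch.initial_seed())
train_a, _ = random_split(range(10), [3, 7])

# For comparison: seed with 0 and split immediately.
torch.manual_seed(0)
train_b, _ = random_split(range(10), [3, 7])

print(list(train_a.indices) == list(train_b.indices))
```

Keep in mind that the reset rewinds the global generator for everything that follows, not only for the split.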

AFAIK PyTorch does not provide arguments like seed or random_state (as sklearn does, for example).

answered Sep 23 '22 18:09

Szymon Maszke

As you can see from the documentation, it is possible to pass a generator to random_split:

random_split(range(10), [3, 7], generator=torch.Generator().manual_seed(42))
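Here is a slightly fuller sketch of the same idea (the function name seeded_split is hypothetical, and range(10) again stands in for a real dataset). Because the generator is created and seeded locally, the split should not depend on whatever else has used the global RNG in the meantime:

```python
import torch
from torch.utils.data import random_split


def seeded_split(seed=42):
    # Hypothetical helper: a fresh, locally seeded generator for the split only.
    # The global RNG is neither read nor modified here.
    generator = torch.Generator().manual_seed(seed)
    return random_split(range(10), [3, 7], generator=generator)


train_a, test_a = seeded_split()
_ = torch.rand(100)  # unrelated global draws in between
train_b, test_b = seeded_split()

print(list(train_a.indices) == list(train_b.indices))
```

A new generator is created on every call on purpose: reusing one generator object across calls would advance its state and give a different split the second time.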

answered Sep 26 '22 18:09

Matteo Pennisi


Tags: pytorch, torch

cerebrou
