Applying convolution operation to image - PyTorch

Tags:

python

pytorch

To render an image of shape 27x35 I use:

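import random

import numpy as np
import matplotlib.pyplot

# 945 random greyscale values (27 * 35 pixels), each in [0, 255]
random_image = []
for x in range(1, 946):
    random_image.append(random.randint(0, 255))

random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27, 35))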

This generates :

[image: the rendered 27x35 random array]

I then try to apply a convolution to the image using torch.nn.Conv2d:

conv2 = torch.nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)

image_d = np.asarray(random_image_arr.reshape(27 , 35))

conv2(torch.from_numpy(image_d))

But this raises an error:

~/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    299     def forward(self, input):
    300         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 301                         self.padding, self.dilation, self.groups)
    302 
    303 

RuntimeError: input has less dimensions than expected

The shape of the input image_d is (27, 35)

Should I change the parameters of Conv2d in order to apply the convolution to the image ?

Update: following @McLawrence's answer, I have:

random_image = []
for x in range(1 , 946):
        random_image.append(random.randint(0 , 255))

random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27 , 35))

This renders image :

[image: the rendered 27x35 random array]

Applying the convolution operation :

conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)

image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27 , 35))).numpy()

fc = conv2(torch.from_numpy(image_d))

matplotlib.pyplot.imshow(fc[0][0].data.numpy())

renders image :

[image: the first output channel of the convolution]

asked Oct 31 '25 04:10 by blue-sky


1 Answer

There are two problems with your code:

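First, 2D convolutions in PyTorch expect a 4D input tensor of shape (batch size, channels, height, width), which is convenient for use in neural networks. The first dimension is the batch size and the second is the number of channels (an RGB image, for example, has three channels). Newer PyTorch releases also accept an unbatched 3D tensor of shape (channels, height, width), but a plain 2D array of shape (27, 35) is rejected either way. So you have to reshape your tensor like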

image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27 , 35)))

The FloatTensor is important here, since convolutions are not defined on a LongTensor, which is what you get automatically if your NumPy array contains only ints.
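The difference between the two dtypes can be checked directly. The following is a minimal sketch rather than the asker's exact code: the names `pixels`, `as_long` and `as_float` are made up for illustration, and `np.random.default_rng` (with a fixed seed for repeatability) stands in for the `random.randint` loop from the question.

```python
import numpy as np
import torch

# Integer pixel values in [0, 255], similar to the question's random image.
# The seed and the variable names are assumptions for this sketch.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(27, 35))

# from_numpy keeps the integer dtype of the NumPy array.
as_long = torch.from_numpy(pixels)

# Cast to float32 first, then add the batch and channel axes: (1, 1, 27, 35).
as_float = torch.from_numpy(pixels.astype(np.float32)).reshape(1, 1, 27, 35)

print(as_long.dtype, tuple(as_long.shape))
print(as_float.dtype, tuple(as_float.shape))
```

Casting on the NumPy side with `astype(np.float32)` is one option; calling `.float()` on the tensor afterwards should give the same result.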

Second, you have created a convolution with three input channels, while your image has just one channel (it is greyscale). So you have to adjust the convolution to:

conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)
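Putting both fixes together, a self-contained sketch might look like the following. The names `image`, `x`, `conv`, `out` and `first_map` are illustrative, the random image is generated with NumPy instead of the question's loop, and the layer's weights are PyTorch's default random initialisation, so the values in the feature maps differ from run to run.

```python
import numpy as np
import torch

# A random 27x35 greyscale image as float32 (seed chosen arbitrarily).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(27, 35)).astype(np.float32)

# Shape (batch=1, channels=1, height=27, width=35).
x = torch.from_numpy(image).reshape(1, 1, 27, 35)

# One input channel (greyscale), 18 output channels.
conv = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)

# No gradients are needed just to inspect the output.
with torch.no_grad():
    out = conv(x)

print(tuple(out.shape))  # (1, 18, 27, 35)

# First feature map as a NumPy array, e.g. for matplotlib.pyplot.imshow.
first_map = out[0, 0].numpy()
```

With kernel_size=3, stride=1 and padding=1 the spatial size is preserved, since (27 + 2*1 - 3)/1 + 1 = 27 and (35 + 2*1 - 3)/1 + 1 = 35. Only the channel dimension changes, from 1 to 18.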
answered Nov 02 '25 18:11 by McLawrence


