Here is my DataLoader. When I use ToTensor, it changes the dimensions of the image from H x W x C to C x H x W. Is permute okay to use to rearrange this, or might it change the orientation of the image?
class DPWHDataset(Dataset):
    """Segmentation dataset: loads an RGB image and its per-class channel mask.

    Each item is a pair ``(image, mask)`` where ``image`` is a normalized
    C x H x W float tensor and ``mask`` is the tensor form of the channel
    mask built by ``create_channel_mask``.
    """

    def __init__(self, mean=None, std=None, phase=None, dataset=None):
        """
        Args:
            mean: per-channel means used by Normalize.
            std: per-channel stds used by Normalize.
            phase: split identifier (e.g. "train"/"val"), forwarded to
                get_transforms.
            dataset: sequence of image base names (without extension).
        """
        self.data = dataset
        self.mean = mean
        self.std = std
        self.phase = phase
        # NOTE(review): built but currently unused in __getitem__ — the
        # albumentations-style augmentation path is disabled below.
        self.transforms = get_transforms(phase, mean, std)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        image_name = self.data[idx]
        image_path = image_prefix + image_name + ".jpg"
        mask_path = binary_mask_prefix + image_name + "_mask.png"

        # cv2.imread returns None (no exception) on a missing/unreadable
        # file; fail fast with a clear message instead of an opaque
        # downstream cv2/torch error.
        image = cv2.imread(image_path)
        if image is None:
            raise FileNotFoundError("could not read image: " + image_path)
        mask = cv2.imread(mask_path, 0)  # 0 -> load as single-channel grayscale
        if mask is None:
            raise FileNotFoundError("could not read mask: " + mask_path)

        # OpenCV loads images in BGR channel order; convert to RGB.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = create_channel_mask(mask)

        # ToTensor converts an H x W x C uint8 ndarray to a C x H x W
        # float tensor scaled to [0, 1]; Normalize then applies
        # (x - mean) / std per channel.
        image = torchvision.transforms.ToTensor()(image)
        image = torchvision.transforms.Normalize(mean=self.mean, std=self.std)(image)
        mask = torchvision.transforms.ToTensor()(mask)
        return image, mask
According to the documentation, torchvision.transforms.ToTensor converts a PIL Image or numpy.ndarray (H x W x C) to a torch.FloatTensor of shape (C x H x W).
So, in the following line:
image = torchvision.transforms.ToTensor()(image)
The resultant image tensor is of shape (C x H x W) and the input tensor is of shape (H x W x C). You can verify this by printing the tensor shapes.
And yes, you can rearrange the dimensions using torch.permute. It only reorders the axes of the tensor without touching the underlying pixel values, so it will not flip, rotate, or otherwise change the orientation of the image — as long as you permute the axes into the order you actually intend (e.g. tensor.permute(1, 2, 0) to go from C x H x W back to H x W x C).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With