
How to use tf.Dataset with TIFF files in image segmentation?

I have two sets of files: masks and images. There is no TIFF decoder in 'tensorflow', but there is one in 'tfio.experimental'. The TIFF files have more than 4 channels.

This code doesn't work:

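    import glob
    import os

    import numpy as np
    import tifffile as tiff
    import tensorflow as tf
    import tensorflow_io as tfio

    # Create the output folders, then write 100 random image/mask pairs.
    os.makedirs('new1', exist_ok=True)
    os.makedirs('new2', exist_ok=True)
    for i in range(100):
      a = np.random.random((30, 30, 8))
      b = np.random.randint(10, size = (30, 30, 8))
      tiff.imwrite('new1/images'+str(i)+'.tif', a)
      tiff.imwrite('new2/images'+str(i)+'.tif', b)

    # Sort both lists so that every image is paired with its own mask.
    paths1 = sorted(glob.glob('new1/*.*'))
    paths2 = sorted(glob.glob('new2/*.*'))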

    def load(image_file, mask_file):
      image = tf.io.read_file(image_file)
      image = tfio.experimental.image.decode_tiff(image)

      mask = tf.io.read_file(mask_file)
      mask = tfio.experimental.image.decode_tiff(mask)

      input_image = tf.cast(image, tf.float32)
      mask_image = tf.cast(mask, tf.uint8)
      return input_image, mask_image

    AUTO = tf.data.experimental.AUTOTUNE
    BATCH_SIZE = 32

    dataloader = tf.data.Dataset.from_tensor_slices((paths1, paths2))

    dataloader = (
        dataloader
        .shuffle(1024)
        .map(load, num_parallel_calls=AUTO)
        .batch(BATCH_SIZE)
        .prefetch(AUTO)
    )

The entire dataset does not fit in memory, and saving it as NumPy arrays offers no easy solution either. The code above raises no error directly, but the shape of the images is (None, None, None).

'model.fit' then fails with an error.

Is there an alternative way to store the arrays? The only solution I see is brute force: manually feeding random batches in a custom training loop.

XMaSt3R asked Oct 30 '25 21:10



1 Answer

I found a solution to my question: a data generator based on 'tf.keras.utils.Sequence' can work with any kind of file.

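    import math

    import numpy as np
    import tensorflow as tf
    import tifffile as tiff

    class Gen(tf.keras.utils.Sequence):
        """Loads batches of TIFF images and masks from disk on demand."""

        def __init__(self, x_set, y_set, batch_size):
            super().__init__()
            self.x, self.y = x_set, y_set
            self.batch_size = batch_size

        def __len__(self):
            # Number of batches per epoch; the last one may be smaller.
            return math.ceil(len(self.x) / self.batch_size)

        def __getitem__(self, idx):
            # Slice the file lists for batch idx, then read only those files.
            start = idx * self.batch_size
            stop = start + self.batch_size
            batch_x = self.x[start:stop]
            batch_y = self.y[start:stop]
            images = np.array([tiff.imread(path) for path in batch_x])
            masks = np.array([tiff.imread(path) for path in batch_y])
            return images, masks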

It works without any problems.
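
If staying inside 'tf.data' is preferred, one possible alternative (not part of the solution above) is to wrap 'tifffile.imread' in 'tf.numpy_function' and then restore the static shape, which is what was missing when the images came out as (None, None, None). This is only a minimal sketch under stated assumptions: every file is assumed to have the fixed shape 30 x 30 x 8 used in the question, and the names '_read_pair' and 'make_dataset' are hypothetical helpers introduced here for illustration.

```python
import numpy as np
import tensorflow as tf
import tifffile

# Assumption: every image and mask has this fixed shape, as in the question.
HEIGHT, WIDTH, CHANNELS = 30, 30, 8


def _read_pair(image_path, mask_path):
    # Runs as ordinary Python/NumPy code; the paths arrive as bytes objects.
    image = tifffile.imread(image_path.decode()).astype(np.float32)
    mask = tifffile.imread(mask_path.decode()).astype(np.uint8)
    return image, mask


def load(image_path, mask_path):
    image, mask = tf.numpy_function(
        _read_pair, [image_path, mask_path], [tf.float32, tf.uint8]
    )
    # tf.numpy_function returns tensors of unknown shape, so set it by hand.
    image.set_shape((HEIGHT, WIDTH, CHANNELS))
    mask.set_shape((HEIGHT, WIDTH, CHANNELS))
    return image, mask


def make_dataset(image_paths, mask_paths, batch_size=32):
    # Only the file paths are held in memory; pixels are read per element.
    ds = tf.data.Dataset.from_tensor_slices((image_paths, mask_paths))
    ds = ds.shuffle(len(image_paths))
    ds = ds.map(load, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Calling 'make_dataset(paths1, paths2)' with two sorted path lists should give a dataset that can be passed to 'model.fit'. Note that 'tf.numpy_function' executes in the Python interpreter, so the reads are not truly parallel and the resulting graph cannot be exported; for many training setups the 'Sequence' generator above is just as practical.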

XMaSt3R answered Nov 01 '25 11:11



