I'm trying to apply a binary, single-channel mask to an RGB image (3 channels, normalized to [0, 1]). My current solution splits the RGB image into its channels, multiplies each by the mask, and concatenates the channels again:
with tf.variable_scope('apply_mask') as scope:
    # Output mask is in range [-1, 1]; bring it to range [0, 1] first.
    zero_one_mask = (output_mask + 1) / 2
    # Apply mask to all channels.
    channels = tf.split(3, 3, output_img)
    channels = [tf.mul(c, zero_one_mask) for c in channels]
    output_img = tf.concat(3, channels)
However, this seems pretty inefficient, especially since, to my understanding, none of these computations are done in place. Is there a more efficient way to do this?
The tf.mul() operator supports numpy-style broadcasting, which would allow you to simplify and optimize the code slightly.
Let's say that zero_one_mask is an m x n tensor, and output_img is a b x m x n x 3 tensor (where b is the batch size; I'm inferring this from the fact that you split output_img along dimension 3)*. You can use tf.expand_dims() to make zero_one_mask broadcastable against output_img by reshaping it to be an m x n x 1 tensor:
with tf.variable_scope('apply_mask') as scope:
    # Output mask is in range [-1, 1]; bring it to range [0, 1] first.
    # NOTE: Assumes `output_mask` is a 2-D `m x n` tensor.
    zero_one_mask = tf.expand_dims((output_mask + 1) / 2, 2)
    # Apply mask to all channels.
    # NOTE: Assumes `output_img` is a 4-D `b x m x n x c` tensor.
    output_img = tf.mul(output_img, zero_one_mask)
(* This would work equally well if output_img were a 4-D b x m x n x c tensor (for any number of channels c) or a 3-D m x n x c tensor, due to the way broadcasting works.)
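For intuition, here is a minimal NumPy sketch of the same broadcasting rule (the shapes b, m, n, c and the random data are made up for illustration). It checks that one broadcast multiply with an m x n x 1 mask produces exactly the same result as masking each channel separately and re-stacking:

```python
import numpy as np

# Hypothetical shapes: batch b=2, height m=4, width n=5, channels c=3.
b, m, n, c = 2, 4, 5, 3
rng = np.random.default_rng(0)
output_img = rng.random((b, m, n, c))
output_mask = rng.random((m, n)) * 2 - 1  # values in [-1, 1], as in the question

# Bring the mask to [0, 1].
zero_one_mask = (output_mask + 1) / 2

# Split approach: mask each b x m x n channel, then stack back to b x m x n x c.
channels = [output_img[..., i] for i in range(c)]
masked_split = np.stack([ch * zero_one_mask for ch in channels], axis=-1)

# Broadcast approach: expand the mask to m x n x 1 and multiply once;
# its dimensions align with the trailing m x n x c dimensions of the image.
masked_broadcast = output_img * zero_one_mask[:, :, np.newaxis]

print(np.allclose(masked_split, masked_broadcast))  # True
```

The trailing size-1 dimension is what lets the mask pair up with any number of channels c, which is why the answer's version works for both 3-D and 4-D images.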