I've read the tf.scatter_nd documentation and run the example code for 1D and 3D tensors... and now I'm trying to do it for a 2D tensor. I want to 'interleave' the columns of two tensors. For 1D tensors, one can do this via
'''
We want to interleave elements of 1D tensors arr1 and arr2, where
arr1 = [10, 11, 12]
arr2 = [1, 2, 3, 4, 5, 6]
such that
desired result = [1, 2, 10, 3, 4, 11, 5, 6, 12]
'''
import tensorflow as tf

with tf.Session() as sess:
    # scatter the six values of arr2 into positions 0, 1, 3, 4, 6, 7
    updates1 = tf.constant([1, 2, 3, 4, 5, 6])
    indices1 = tf.constant([[0], [1], [3], [4], [6], [7]])
    shape = tf.constant([9])
    scatter1 = tf.scatter_nd(indices1, updates1, shape)

    # scatter the three values of arr1 into positions 2, 5, 8
    updates2 = tf.constant([10, 11, 12])
    indices2 = tf.constant([[2], [5], [8]])
    scatter2 = tf.scatter_nd(indices2, updates2, shape)

    # the two scatters fill disjoint positions, so adding them interleaves
    result = scatter1 + scatter2
    print(sess.run(result))
(Aside: is there a better way to do this? I'm all ears -- one alternative is sketched below the output.)
This gives the output
[ 1 2 10 3 4 11 5 6 12]
Yay! That worked!
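For what it's worth, here is one possible alternative for this particular 1D pattern (a minimal sketch, not necessarily better): since the output repeats "two values from updates1, then one from updates2", you can reshape each tensor into groups, concatenate the groups, and flatten.

updates1 = tf.constant([1, 2, 3, 4, 5, 6])
updates2 = tf.constant([10, 11, 12])
pairs = tf.reshape(updates1, [3, 2])      # [[1,2],[3,4],[5,6]]
singles = tf.reshape(updates2, [3, 1])    # [[10],[11],[12]]
# concatenate each pair with its single, then flatten back to 1D
result = tf.reshape(tf.concat([pairs, singles], axis=1), [-1])
# -> [ 1  2 10  3  4 11  5  6 12]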
Now let's try to extend this to 2D.
'''
We want to interleave the *columns* (not rows; rows would be easy!) of
arr1 = [[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]]
arr2 = [[10, 11, 12], [10, 11, 12], [10, 11, 12]]
such that
desired result = [[1,2,10,3,4,11,5,6,12],[1,2,10,3,4,11,5,6,12],[1,2,10,3,4,11,5,6,12]]
'''
updates1 = tf.constant([[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]])
indices1 = tf.constant([[0], [1], [3], [4], [6], [7]])
shape = tf.constant([3, 9])
scatter1 = tf.scatter_nd(indices1, updates1, shape)
This gives the error
ValueError: The outer 1 dimensions of indices.shape=[6,1] must match the outer 1
dimensions of updates.shape=[3,6]: Dimension 0 in both shapes must be equal, but
are 6 and 3. Shapes are [6] and [3]. for 'ScatterNd_2' (op: 'ScatterNd') with
input shapes: [6,1], [3,6], [2].
It seems my indices are specifying row indices instead of column indices, and given the way that arrays are laid out in numpy and tensorflow (i.e., row-major order), does that mean
I need to explicitly specify every single (row, column) pair of indices for every element in updates1?
Or is there some kind of 'wildcard' specification I can use for the rows? (Note that indices1 = tf.constant([[:,0], [:,1], [:,3], [:,4], [:,6], [:,7]]) gives syntax errors, as it probably should.)
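(It turns out the answer to the first question is yes: for a 2D output, scatter_nd wants a full (row, column) pair per scalar element. But the pairs can be generated rather than typed by hand. A minimal sketch of that route -- the names rows, cols, and indices2d are mine:)

cols = tf.constant([0, 1, 3, 4, 6, 7])   # target columns, same as the 1D case
rows = tf.range(3)                        # one entry per row of updates1
# build every (row, col) pair; indices2d has shape [3, 6, 2]
rr, cc = tf.meshgrid(rows, cols, indexing='ij')
indices2d = tf.stack([rr, cc], axis=-1)
scatter1 = tf.scatter_nd(indices2d, updates1, tf.constant([3, 9]))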
Would it be easier to just do a transpose, interleave the rows, then transpose back? Because I tried that...
scatter1 = tf.scatter_nd(indices1, tf.transpose(updates1), tf.transpose(shape))
print(sess.run(tf.transpose(scatter1)))
...and got a much longer error message that I don't feel like posting unless someone requests it.
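(For what it's worth, the transpose route can be made to work. Note that tf.transpose(shape) is a no-op on a 1-D tensor, so the shape stays [3, 9] instead of becoming [9, 3], which is likely the source of the error. A minimal sketch with the transposed shape written out explicitly:)

# scatter whole rows of the transposed updates, then transpose back
scatter1_t = tf.scatter_nd(indices1, tf.transpose(updates1), tf.constant([9, 3]))
scatter1 = tf.transpose(scatter1_t)   # shape [3, 9], columns 0,1,3,4,6,7 filled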
PS: I searched to make sure this isn't a duplicate -- I find it hard to imagine that someone else hasn't asked this before -- but turned up nothing.
This is pure slicing, but I didn't know that syntax like arr1[0:,:][:,:2] actually works. It seems it does, though I'm not sure whether it is better.
This may be the wildcard slicing mechanism you are looking for.
arr1 = tf.constant([[1,2,3,4,5,6],[1,2,3,4,5,7],[1,2,3,4,5,8]])
arr2 = tf.constant([[10, 11, 12], [10, 11, 12], [10, 11, 12]])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # take two columns of arr1, then one column of arr2, repeatedly
    print(sess.run(tf.concat([arr1[0:,:][:,:2],  arr2[0:,:][:,:1],
                              arr1[0:,:][:,2:4], arr2[0:,:][:,1:2],
                              arr1[0:,:][:,4:6], arr2[0:,:][:,2:3]], axis=1)))
Output is
[[ 1 2 10 3 4 11 5 6 12]
[ 1 2 10 3 4 11 5 7 12]
[ 1 2 10 3 4 11 5 8 12]]
So, for example,
arr1[0:,:] returns
[[1 2 3 4 5 6]
[1 2 3 4 5 7]
[1 2 3 4 5 8]]
and arr1[0:,:][:,:2] returns the first two columns
[[1 2]
[1 2]
[1 2]]
The concatenation axis is 1, so the slices are joined along columns.
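Note that arr1[0:,:] selects all rows and all columns, so it is just arr1; the chained slices can be collapsed into single ones. A minimal equivalent sketch:

result = tf.concat([arr1[:, 0:2], arr2[:, 0:1],
                    arr1[:, 2:4], arr2[:, 1:2],
                    arr1[:, 4:6], arr2[:, 2:3]], axis=1)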
Some moderators might have regarded my question as a duplicate of this one -- not because the questions are the same, but because the answers contain parts one can use to answer this question, i.e., specifying every index combination by hand.
A totally different method would be to multiply by a permutation matrix as shown in the last answer to this question. Since my original question was about scatter_nd, I'm going to post this solution but wait to see what other answers come in... (Alternatively, I or someone could edit the question to make it about reordering columns, not specific to scatter_nd --EDIT: I have just edited the question title to reflect this).
Here, we concatenate the two different arrays/tensors...
import numpy as np
import tensorflow as tf
sess = tf.Session()
# the ultimate application is for merging variables which should be in groups,
# e.g. in this example, [1,2,10] is a group of 3, and there are 3 groups of 3
n_groups = 3
vars_per_group = 3 # once the single value from arr2 (below) is included
arr1 = 10+tf.range(n_groups, dtype=float)
arr1 = tf.stack((arr1,arr1,arr1),0)
arr2 = 1+tf.range(n_groups * (vars_per_group-1), dtype=float)
arr2 = tf.stack((arr2,arr2,arr2),0)
catted = tf.concat((arr1,arr2),1) # concatenate the two arrays together
print("arr1 = \n",sess.run(arr1))
print("arr2 = \n",sess.run(arr2))
print("catted = \n",sess.run(catted))
Which gives output
arr1 =
[[10. 11. 12.]
[10. 11. 12.]
[10. 11. 12.]]
arr2 =
[[1. 2. 3. 4. 5. 6.]
[1. 2. 3. 4. 5. 6.]
[1. 2. 3. 4. 5. 6.]]
catted =
[[10. 11. 12. 1. 2. 3. 4. 5. 6.]
[10. 11. 12. 1. 2. 3. 4. 5. 6.]
[10. 11. 12. 1. 2. 3. 4. 5. 6.]]
Now we build the permutation matrix and multiply...
start_index = 2 # location of where the interleaving begins
# cml = "column map list" is the list of where each column will get mapped to
cml = [start_index + x*(vars_per_group) for x in range(n_groups)] # first array
for i in range(n_groups): # second array
cml += [x + i*(vars_per_group) for x in range(start_index)] # vars before start_index
cml += [1 + x + i*(vars_per_group) + start_index \
for x in range(vars_per_group-start_index-1)] # vars after start_index
print("\n cml = ",cml,"\n")
# Create the permutation matrix from cml
np_perm_mat = np.zeros((len(cml), len(cml)))
for idx, i in enumerate(cml):
np_perm_mat[idx, i] = 1
perm_mat = tf.constant(np_perm_mat,dtype=float)
result = tf.matmul(catted, perm_mat)
print("result = \n",sess.run(result))
Which gives output
cml = [2, 5, 8, 0, 1, 3, 4, 6, 7]
result =
[[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]
[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]
[ 1. 2. 10. 3. 4. 11. 5. 6. 12.]]
Even though this doesn't use scatter_nd as the original question asked, one thing I like about this approach is that you can allocate perm_mat once in some __init__() method and hang on to it; after that initial overhead, it's just matrix-matrix multiplication by a sparse, constant matrix, which should be pretty fast. (?)
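As an aside, since multiplying by a permutation matrix is just a reordering of columns, the same result can presumably be obtained with a single gather -- a minimal sketch, assuming a TensorFlow version where tf.gather accepts the axis argument; inv is the inverse of cml, i.e., inv[j] is the source column for output column j:

inv = np.argsort(cml)                     # inverse permutation: [3 4 0 5 6 1 7 8 2]
result2 = tf.gather(catted, inv, axis=1)  # reorder columns directly
print("result2 = \n", sess.run(result2))  # same output as the matmul version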
Still happy to wait and see what other answers might come in.