I would like to use TensorFlow's eager execution functionality to optimize the components of a vector. In all the documented examples, each trainable variable is just a scalar, with collections represented by lists of these. However, the loss function I have in mind involves performing vector manipulations on those components, and so this is inconvenient.
For example, let us use the Adam optimizer to normalize a 3-component vector:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):
    d = tfe.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        grads = grad(d)
        optimizer.apply_gradients(zip(grads, d))  # Throws error
    return d
This code correctly computes the required gradients. However, the "optimizer.apply_gradients" line throws an error no matter what I try, essentially because a tfe.Variable is not iterable.
In this specific example the error is "AttributeError: Tensor.name is meaningless when eager execution is enabled". We could also try, for example,
zip(grads, [d[i] for i in range(3)])
instead of d, but then the interpreter complains that d is not iterable.
What is the correct way to pair grads with d?
Optimizer.apply_gradients requires its first argument to be a list of (gradient, variable) pairs.
In the code above, neither grads nor d is a list (try print(type(grads)), for example), so zipping them does not produce the (gradient, variable) pairs that apply_gradients expects. I think what you want instead is:
optimizer.apply_gradients(zip([grads], [d]))
Or, more simply:
optimizer.apply_gradients([(grads, d)])
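The zip pattern does become the natural fit once you optimize several variables at once, because GradientTape.gradient accepts a list of sources and returns a matching list of gradients. A minimal sketch under that assumption (the variables a and b here are purely illustrative, not part of your example):

# Two variables; tape.gradient returns their gradients as a list in the same order.
a = tf.Variable([1.0, 2.0])
b = tf.Variable(3.0)
with tf.GradientTape() as tape:
    loss_val = tf.reduce_sum(a * a) + b * b
grads = tape.gradient(loss_val, [a, b])        # [grad_wrt_a, grad_wrt_b]
optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
optimizer.apply_gradients(zip(grads, [a, b]))  # pairs line up correctly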
Also, FYI, as eager execution is stabilizing, more things are moving out of the experimental "contrib" namespace, so you don't need the tfe module for your example (tf.Variable will work just fine) if you're using a recent version of TensorFlow (1.11, 1.12, etc.), making your whole program look like:
import tensorflow as tf
import numpy as np

tf.enable_eager_execution()

def normalize(din=[2.0, 1.0, 0.0], lr=0.001, nsteps=100):
    d = tf.Variable(din)

    def loss(dvec):
        return tf.sqrt((1.0 - tf.tensordot(dvec, dvec, 1))**2)

    def grad(dvec):
        with tf.GradientTape() as tape:
            loss_val = loss(dvec)
        return tape.gradient(loss_val, dvec)

    optimizer = tf.train.AdamOptimizer(learning_rate=lr)
    for i in range(nsteps):
        dd = grad(d)
        optimizer.apply_gradients([(dd, d)])  # single (gradient, variable) pair
    return d
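For completeness, a quick way to try it out. Note that with the default lr=0.001 and nsteps=100 the vector only moves a little toward unit norm, so you need more steps to see it converge:

d = normalize()
print(d.numpy())                      # updated components
print(tf.tensordot(d, d, 1).numpy())  # squared norm, moving toward 1.0

# More steps bring the squared norm much closer to 1.0:
d = normalize(nsteps=10000)
print(tf.tensordot(d, d, 1).numpy())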
Hope that helps!