Background
My question is based on an example from Hands-On Machine Learning by Géron, Chapter 12: Custom Models.
The purpose of this example is to create a custom neural network model with 5 Dense hidden layers. The custom part is a reconstruction layer added before the output, whose purpose is to reconstruct the inputs. We then take the difference reconstruction - inputs, compute its MSE, and add that value to the model's loss. It's supposed to be a regularization step.
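Concretely (reading the 0.05 factor off the code below), the total training loss is

loss = MSE(y, y_pred) + 0.05 * MSE(reconstruction, inputs)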
Minimum (should be) Working Example
The following code is almost directly from the textbook, but it doesn't work.
import numpy as np

num_training = 10
num_dim = 2
X = np.random.random((num_training, num_dim))
y = np.random.random(num_training)
import tensorflow as tf
import tensorflow.keras as keras

class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [keras.layers.Dense(30, activation="selu",
                                          kernel_initializer="lecun_normal")
                       for _ in range(5)]
        self.out = keras.layers.Dense(output_dim)

    def build(self, batch_input_shape):
        # The reconstruction layer can only be created here, once the
        # number of input features is known.
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs)
        super().build(batch_input_shape)

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
        # Auxiliary loss: how well the last hidden representation can
        # reconstruct the inputs, scaled down by 0.05.
        reconstruction = self.reconstruct(Z)
        recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
        self.add_loss(0.05 * recon_loss)
        return self.out(Z)

model = ReconstructingRegressor(1)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X, y, epochs=2)
Error Message
However, I get the following error when calling model.fit():
---------------------------------------------------------------------------
InaccessibleTensorError Traceback (most recent call last)
<ipython-input-10-b7211d3022fa> in <module>
34 model = ReconstructingRegressor(1)
35 model.compile(loss="mse", optimizer="nadam")
---> 36 history = model.fit(X, y, epochs=2)
and, at the end of the error message:
InaccessibleTensorError: The tensor 'Tensor("mul:0", shape=(), dtype=float32)' cannot be accessed here: it is defined in another function or code block. Use return values, explicit Python locals or TensorFlow collections to access it. Defined in: FuncGraph(name=build_graph, id=140602287140624); accessed from: FuncGraph(name=train_function, id=140602287108640).
Troubleshooting
If I comment out the code that computes the loss in call, i.e.,

#recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
#self.add_loss(0.05 * recon_loss)

but keep everything else the same, then I get the following warning:
WARNING:tensorflow:Gradients do not exist for variables ['dense/kernel:0', 'dense/bias:0'] when minimizing the loss.
Not sure if that's relevant.
I am not 100% sure, but I believe the problem is that the loss you are adding via self.add_loss refers to layers that are not used when computing the main loss, so they are possibly optimized out of the main graph. Hence, when the training function tries to access that tensor, it lives in another graph and is inaccessible. (That would also explain the warning you saw after commenting the loss out: with the reconstruction loss gone, the reconstruct layer's weights no longer contribute to any loss, so no gradients exist for them.)
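A quick way to test that theory (my suggestion, not something from the book) is to compile the original model with run_eagerly=True. Eager execution skips graph tracing entirely, so if fit then runs, the error really is a graph-scoping issue:

model = ReconstructingRegressor(1)
# run_eagerly is a standard compile() argument; it disables graph
# tracing, so no tensor ends up defined in one FuncGraph and
# accessed from another.
model.compile(loss="mse", optimizer="nadam", run_eagerly=True)
model.fit(X, y, epochs=2)  # if this trains, graph scoping was the culprit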
I think the easiest way is to rewrite the network slightly differently:

- Use the training parameter of model.call to apply the reconstruction layer only during training.
- Override model.train_step to still be able to use fit. (See the guide: Customize what happens in Model.fit.)

Using the training argument of model.call, we apply the reconstruct layer only during training, and we make the network return both the prediction and the reconstruction. When we want to make a prediction, however, we return only the prediction.
Overriding train_step is just there so we can still use fit rather than writing the training loop from scratch. We don't need to override test_step in this case, because the use case is fairly simple.
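That said, if you ever wanted evaluate to report the reconstruction term too, a test_step override would mirror train_step. A sketch under that assumption (not needed for this example):

    def test_step(self, data):
        x, y = data
        # Ask for the training branch so the reconstruction is returned too.
        y_pred, recon = self(x, training=True)
        self.compiled_loss(y, y_pred)
        self.compiled_metrics.update_state(y, y_pred)
        results = {m.name: m.result() for m in self.metrics}
        # Surface the auxiliary term alongside the compiled metrics.
        results["recon_loss"] = self.reconstruction_loss(recon, x)
        return results

Here is the full rewritten model: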
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np

num_training = 10
num_dim = 2
X = np.random.random((num_training, num_dim)).astype(np.float32)
y = np.random.random((num_training,)).astype(np.float32)

class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [
            keras.layers.Dense(
                30,
                activation="selu",
                kernel_initializer="lecun_normal",
                name=f"hidden_{idx}",
            )
            for idx in range(5)
        ]
        self.out = keras.layers.Dense(output_dim, name="output")

    def build(self, batch_input_shape):
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs, name="reconstruct")
        super().build(batch_input_shape)

    @staticmethod
    def reconstruction_loss(reconstruction, inputs, rate=0.05):
        return tf.reduce_mean(tf.square(reconstruction - inputs)) * rate

    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred, recon = self(x, training=True)
            loss = self.compiled_loss(y, y_pred)
            loss += self.reconstruction_loss(recon, x)
        gradients = tape.gradient(loss, self.trainable_variables)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
        if training:
            return self.out(Z), self.reconstruct(Z)
        return self.out(Z)

model = ReconstructingRegressor(1)
model.compile(optimizer="nadam", loss="mse")

history = model.fit(X, y, epochs=10)
eval_loss = model.evaluate(X, y)
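For completeness, a quick usage sketch: outside train_step the model is called with training=False (which is what predict does), so only the regression output comes back:

y_pred = model.predict(X)                # array of shape (10, 1): predictions only
y_pred, recon = model(X, training=True)  # tuple: (prediction, reconstruction)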