I just posted about another problem with the same code, but progress is extremely slow because I know very little about what I'm doing. The link to the previous problem is here: Keras ValueError: No gradients provided for any variable
I'm currently trying to get my model to run in order to classify 5000 different events, each of which is a 2D numpy array of 29x29 values.
I define my NN like so:
import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(shape=(29,29,1))
x=inputs
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_1')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_1')(x)
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_2')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_2')(x)
x = keras.layers.Conv2D(32, kernel_size=(3,3), name='Conv_3')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_3')(x)
x = keras.layers.Flatten(name='Flatten')(x)
x = keras.layers.Dense(64, name='Dense_1')(x)
x = keras.layers.ReLU(name='ReLU_dense_1')(x)
x = keras.layers.Dense(64, name='Dense_2')(x)
x = keras.layers.ReLU(name='ReLU_dense_2')(x)
outputs = keras.layers.Dense(4, activation='softmax', name='Output')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='VGGlike_CNN')
model.summary()
keras.utils.plot_model(model, show_shapes=True)
OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=LR_ST)
model.compile(optimizer=OPTIMIZER,
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              run_eagerly=False)
def lr_decay(epoch):
    if epoch < 10:
        return LR_ST
    else:
        return LR_ST * tf.math.exp(0.2 * (10 - epoch))
lr_scheduler = keras.callbacks.LearningRateScheduler(lr_decay)
model_checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='mycnn_best',
    monitor='val_accuracy',
    save_weights_only=True,
    save_best_only=True,
    save_freq='epoch')
callbacks = [ lr_scheduler, model_checkpoint ]
print('X_train.shape = ',X_train.shape)
history = model.fit(X_train, Y_train, epochs=50,
                    validation_data=X_test, shuffle=True, verbose=1,
                    callbacks=callbacks)
It now gives me the error: ValueError: Shapes (32, 2) and (32, 4) are incompatible.
I want to classify each of the events as having 1, 2, 3 or 4 clusters, but before working on something complex, I'm using events which I know only have 1 cluster, so the label for each event is 1.
All of this gives me the idea that the problem is to do with my output being 4 neurons, but I really don't know if that's true, nor do I know how to go about debugging the code.
If anyone could help me I'd be really grateful.
The issue comes from a mismatch between the shape of your labels and the output shape of your model. Since you are using categorical_crossentropy and your output layer has 4 units, the model expects each label to be a one-hot encoded vector of length 4. However, your labels are vectors of length 2. Therefore, if your labels are a 1-D array of integer class indices from 0 to 3 (cluster counts 1 to 4 must first be shifted down by one), you can do
Y_train = tf.one_hot(Y_train, 4)
and the resulting shape will be (5000, 4).
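To make the label handling concrete, here is a minimal NumPy sketch of the same one-hot step. It assumes the labels are a 1-D array of cluster counts from 1 to 4, which is a guess based on the question; the helper name to_one_hot and the stand-in Y_train array are hypothetical and only there for illustration.

```python
import numpy as np

NUM_CLASSES = 4  # matches the 4 units of the 'Output' layer


def to_one_hot(labels, num_classes=NUM_CLASSES):
    """Turn cluster counts 1..num_classes into one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels).astype(int)
    # A 2-D label array (e.g. shape (5000, 2)) is rejected instead of being
    # silently reshaped, because that usually means the labels are not what
    # the loss function expects.
    if labels.ndim != 1:
        raise ValueError(f"expected 1-D labels, got shape {labels.shape}")
    if labels.min() < 1 or labels.max() > num_classes:
        raise ValueError("labels must lie in 1..num_classes")
    # Shift 1..4 down to 0..3 and pick the matching rows of the identity matrix.
    return np.eye(num_classes, dtype=np.float32)[labels - 1]


# Stand-in for the real labels (assumption): 5000 events with 1 cluster each.
Y_train = np.ones(5000, dtype=int)
Y_train_one_hot = to_one_hot(Y_train)

print(Y_train_one_hot.shape)  # (5000, 4)
print(Y_train_one_hot[0])     # [1. 0. 0. 0.]
```

Another option, if you would rather keep integer labels, is to shift them to 0..3 and compile the model with loss='sparse_categorical_crossentropy', which takes class indices instead of one-hot vectors.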