When I run a neural network (without BatchNormalization) in Keras, I understand how the get_weights() function provides the weights and biases of the network. With BatchNorm, however, it produces 4 extra parameters per layer, which I assume are Gamma, Beta, Mean & Std.
I have tried to replicate a simple NN manually using these saved values, but I can't get them to produce the right output. Does anyone know how these values work?
(Screenshots: model without Batch Norm / with Batch Norm)
I will take an example to explain get_weights() for a simple Multi Layer Perceptron (MLP) and for an MLP with Batch Normalization (BN).
Example: Say we are working on the MNIST dataset and using a 2-layer MLP architecture (i.e. 2 hidden layers). Hidden layer 1 has 392 neurons and hidden layer 2 has 196 neurons, so the final architecture of our MLP is 784 x 392 x 196 x 10.
Here 784 is the input image dimension and 10 is the output layer dimension.
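To make the example concrete, here is a minimal sketch of this architecture in Keras (this is an assumed definition of the model_relu used in Case 1 below; the optimizer and loss are just placeholders):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# 784 -> 392 -> 196 -> 10 MLP without Batch Normalization
model_relu = Sequential([
    Dense(392, activation='relu', input_shape=(784,)),  # hidden layer 1
    Dense(196, activation='relu'),                       # hidden layer 2
    Dense(10, activation='softmax'),                     # output layer
])
model_relu.compile(optimizer='adam', loss='categorical_crossentropy')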
Case 1: MLP without Batch Normalization => Let my model be named model_relu and use the ReLU activation function. After training model_relu, calling get_weights() returns a list of size 6, as shown in the screenshot below.
(Screenshot: get_weights() with a simple MLP, no Batch Norm)
The list values are:
(784, 392): weights for hidden layer 1
(392,): bias associated with weights of hidden layer 1
(392, 196): weights for hidden layer 2
(196,): bias associated with weights of hidden layer 2
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
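You can verify these shapes yourself by iterating over the list (a small sketch, assuming the model_relu defined above):

for w in model_relu.get_weights():
    print(w.shape)
# prints (784, 392), (392,), (392, 196), (196,), (196, 10), (10,) - one shape per line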
Case 2: MLP with Batch Normalization => Let my model be named model_batch, which also uses the ReLU activation function along with Batch Normalization. After training model_batch, calling get_weights() returns a list of size 14, as shown in the screenshot below.
(Screenshot: get_weights() with Batch Norm)
The list values are:
(784, 392): weights for hidden layer 1
(392,): bias associated with weights of hidden layer 1
(392,) (392,) (392,) (392,): these four parameters are the gamma, beta, moving mean and moving variance values (size 392 each) of the Batch Normalization layer after hidden layer 1.
(392, 196): weights for hidden layer 2
(196,): bias associated with weights of hidden layer 2
(196,) (196,) (196,) (196,): these four parameters are the gamma, beta, moving mean and moving variance values (size 196 each) of the Batch Normalization layer after hidden layer 2.
(196, 10): weights for output layer
(10,): bias associated with weights of output layer
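For reference, a minimal sketch of what model_batch could look like, assuming BatchNormalization is placed after each hidden Dense layer (the layout that produces the 14-element ordering above):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model_batch = Sequential([
    Dense(392, activation='relu', input_shape=(784,)),
    BatchNormalization(),   # gamma, beta, moving mean, moving variance of size 392 each
    Dense(196, activation='relu'),
    BatchNormalization(),   # gamma, beta, moving mean, moving variance of size 196 each
    Dense(10, activation='softmax'),
])
model_batch.compile(optimizer='adam', loss='categorical_crossentropy')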
So, in Case 2, if you want to get the weights of hidden layer 1, hidden layer 2, and the output layer, the Python code can be something like this:
weights = model_batch.get_weights()
hidden_layer1_wt = weights[0].flatten().reshape(-1, 1)    # (784, 392) kernel of hidden layer 1
hidden_layer2_wt = weights[6].flatten().reshape(-1, 1)    # (392, 196) kernel of hidden layer 2
output_layer_wt = weights[12].flatten().reshape(-1, 1)    # (196, 10) kernel of the output layer
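Since the original question was about replicating the network manually: at inference time a BatchNormalization layer computes y = gamma * (x - moving_mean) / sqrt(moving_variance + epsilon) + beta, i.e. the third and fourth stored parameters are the moving mean and moving variance (not the standard deviation). Below is a hedged sketch of hidden layer 1 computed by hand with NumPy, assuming the model_batch layout above and Keras' default epsilon of 0.001:

import numpy as np

w = model_batch.get_weights()
kernel1, bias1 = w[0], w[1]                               # Dense weights and bias
gamma1, beta1, mean1, var1 = w[2], w[3], w[4], w[5]       # Batch Norm parameters

def hidden_layer1_manual(x, epsilon=1e-3):
    z = np.maximum(0, x @ kernel1 + bias1)                # Dense + ReLU
    return gamma1 * (z - mean1) / np.sqrt(var1 + epsilon) + beta1   # Batch Norm (inference)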
Hope this helps!
Ref: keras-BatchNormalization