Error on prediction running keras multi_gpu_model

Question

I have an issue running a Keras model on a Google Cloud Platform instance.
The model is defined as follows:

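import tensorflow as tf
from keras.models import Sequential
from keras.layers import CuDNNLSTM, LeakyReLU, RepeatVector, TimeDistributed, Dense
from keras.utils import multi_gpu_model

n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]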

train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))

verbose, epochs, batch_size = 1, 1, 64  # low number of epochs just for testing purpose
with tf.device('/cpu:0'):
    m = Sequential()
    m.add(CuDNNLSTM(20, input_shape=(n_timesteps, n_features)))
    m.add(LeakyReLU(alpha=0.1))
    m.add(RepeatVector(n_outputs))
    m.add(CuDNNLSTM(20, return_sequences=True))
    m.add(LeakyReLU(alpha=0.1))
    m.add(TimeDistributed(Dense(20)))
    m.add(LeakyReLU(alpha=0.1))
    m.add(TimeDistributed(Dense(1)))

self.model = multi_gpu_model(m, gpus=8)
self.model.compile(loss='mse', optimizer='adam')

self.model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)

As you can see from the code above, I run the model on a machine with 8 GPUs (Nvidia Tesla K80).
Training works without any errors. However, prediction fails with the following error:

W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at cudnn_rnn_ops.cc:1336 : Unknown: CUDNN_STATUS_BAD_PARAM in tensorflow/stream_executor/cuda/cuda_dnn.cc(1285): 'cudnnSetTensorNdDescriptor( tensor_desc.get(), data_type, sizeof(dims) / sizeof(dims[0]), dims, strides)'

Here is the code that runs the prediction:

self.model.predict(input_x)

I noticed that if I remove the multi-GPU data parallelism, the code works on a single GPU.
More precisely, if I comment out this line, prediction runs without error:

self.model = multi_gpu_model(m, gpus=8)

What am I missing?

virtualenv information

cudatoolkit - 10.0.130
cudnn - 7.6.4
keras - 2.2.4
keras-applications - 1.0.8
keras-base - 2.2.4
keras-gpu - 2.2.4
python - 3.6

UPDATE

train_x.shape = (1441, 288, 1)
train_y.shape = (1441, 288, 1)
input_x.shape = (1, 288, 1)

I tried the suggestion from Olivier Dehaene's reply and it worked.
I changed input_x so that its shape is (8, 288, 1).
To do that, I also changed the shapes of train_x and train_y.
Here is a recap:

train_x.shape = (8065, 288, 1)
train_y.shape = (8065, 288, 1)
input_x.shape = (8, 288, 1)

But now I get the same error during training, on this line:

self.model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)

Olivier Dehaene · Accepted Answer

According to the tf.keras.utils.multi_gpu_model documentation, it works in the following way:

Divide the model's input(s) into multiple sub-batches.

Apply a model copy on each sub-batch. Every model copy is executed on a dedicated GPU.

Concatenate the results (on CPU) into one big batch.
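To make the first step concrete, here is a minimal pure-Python sketch of how a batch could be cut into per-GPU slices. It reflects my reading of the slicing in Keras 2.2.x (integer division for the step, with the last copy taking the remainder), so treat the exact rule as an assumption; `sub_batch_sizes` is a hypothetical helper, not a Keras function.

```python
def sub_batch_sizes(batch_size, n_gpus):
    """Illustrative only: number of samples each model copy would receive."""
    # Every copy except the last gets batch_size // n_gpus samples.
    step = batch_size // n_gpus
    sizes = [step] * (n_gpus - 1)
    # Assumption: the last copy takes whatever is left over.
    sizes.append(batch_size - step * (n_gpus - 1))
    return sizes


print(sub_batch_sizes(64, 8))  # [8, 8, 8, 8, 8, 8, 8, 8]
print(sub_batch_sizes(1, 8))   # [0, 0, 0, 0, 0, 0, 0, 1]
```

With a single sample and 8 GPUs, seven of the eight copies end up with an empty slice.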

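You are triggering an error because the input of the CuDNNLSTM layer is empty for at least one of the model copies. The divide operation only gives every copy some data when batch_size // n_gpus > 0, where batch_size is the number of samples in the batch being split. Your input_x contains a single sample, so with 8 GPUs most copies receive an empty sub-batch.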
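The same condition applies to every batch Keras forms during fit, including the final, possibly smaller one. The sketch below checks that final batch using the shapes from the question; `last_batch_size` and `every_copy_gets_data` are hypothetical helper names, and the check assumes Keras takes batches in order and leaves the remainder for the last one.

```python
def last_batch_size(n_samples, batch_size):
    """Size of the final batch when n_samples are consumed batch_size at a time."""
    remainder = n_samples % batch_size
    # A zero remainder means the final batch is a full one.
    return remainder if remainder else min(batch_size, n_samples)


def every_copy_gets_data(n_samples, batch_size, n_gpus):
    """True if the smallest batch still satisfies batch // n_gpus > 0."""
    return last_batch_size(n_samples, batch_size) // n_gpus > 0


print(last_batch_size(1441, 64), every_copy_gets_data(1441, 64, 8))  # 33 True
print(last_batch_size(8065, 64), every_copy_gets_data(8065, 64, 8))  # 1 False
print(last_batch_size(1, 32, ), every_copy_gets_data(1, 32, 8))      # 1 False
```

If this reading is right, the training failure in the update would come from 8065 = 126 * 64 + 1: the last batch holds a single sample. Dropping one sample (8064 rows) or choosing a batch size whose remainder is at least 8 should avoid it.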

Try this code out:

import numpy as np

# 8 samples, so each of the 8 GPUs receives one sample
input_x = np.random.randn(8, n_timesteps, n_features)
model.predict(input_x)
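If you really need a prediction for fewer samples than GPUs, one possible workaround is to pad the input by repeating rows and then discard the extra outputs. This is only a sketch of that idea: `pad_to_n_gpus` is a hypothetical helper, and it only covers inputs that fit in a single predict batch.

```python
import numpy as np


def pad_to_n_gpus(x, n_gpus):
    """Repeat rows of x until it has at least n_gpus samples.

    Returns the padded array and the number of real samples, so the
    caller can slice the predictions back to the original length.
    """
    n_real = x.shape[0]
    if n_real == 0:
        raise ValueError("cannot pad an empty batch")
    if n_real >= n_gpus:
        return x, n_real
    repeats = -(-n_gpus // n_real)  # ceiling division
    padded = np.concatenate([x] * repeats, axis=0)[:n_gpus]
    return padded, n_real
```

Usage would look like `padded, n_real = pad_to_n_gpus(input_x, 8)` followed by `model.predict(padded)[:n_real]`, which keeps only the rows that correspond to real samples.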
