I have an exploding gradient problem when training for 150-200 epochs with batch size = 256 and about 30-60 minibatches per epoch (this depends on my specific config). The gradients still explode even after I add the code below.
As you can see in the images below, at around step 40k the gradients swing between roughly ±20k, ±40k, and ±60k respectively. I don't know why this happens, since I use clip_grad_value_ as shown above. I am also decaying the learning rate from 0.01 to about 0.008 at step 40k, as sketched below.
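For concreteness, one way to express a decay of that shape is with a standard PyTorch scheduler. The original config is not shown, so the model and optimizer below are hypothetical stand-ins; only the schedule shape (0.01 down to ~0.008 by step 40k) is taken from the post.

```python
import torch

# Hypothetical stand-ins for the real model and optimizer.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# LambdaLR multiplies the base lr by the returned factor.
# Linearly interpolate the factor from 1.0 down to 0.8 over 40k steps,
# i.e. lr goes from 0.01 to 0.008; call scheduler.step() once per batch.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: max(0.8, 1.0 - 0.2 * step / 40_000),
)
```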
Or do I need to update the weight parameters myself, something like this:
[image: snippet showing a manual weight-parameter update]
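(The snippet above is an image and may not render. Purely for illustration, and assuming it shows something like a plain SGD step, a manual update in PyTorch typically looks like the sketch below; this is a guess at the image's contents, not the actual code from the post.)

```python
import torch

model = torch.nn.Linear(10, 1)  # hypothetical stand-in
lr = 0.01

# ... after loss.backward() has populated the .grad fields ...
with torch.no_grad():
    for param in model.parameters():
        if param.grad is not None:
            param -= lr * param.grad  # plain SGD update, in place
```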
But I think optimizer.step() should do the job, and clip_grad_value_ is an in-place operation, so I don't need to use its return value. Please correct me if I did anything wrong. Thank you very much.
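For reference, a minimal sketch of the call ordering being described (clip after backward(), before step()). The model, data, and loss function are hypothetical stand-ins; the clip value of 100 is the one mentioned in the reply below.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()
loader = [(torch.randn(256, 10), torch.randn(256, 1)) for _ in range(5)]

for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # In-place clip of each gradient element to [-100, 100];
    # there is no return value to capture.
    torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=100)
    optimizer.step()
```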
Best regards, Mint
Your code looks right, but try using a smaller value for the clip-value argument. Here's the documentation for the clip_grad_value_() function you're using: https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_value_.html. It shows that each individual element of the gradient is clipped so that its magnitude does not exceed the clip value.
You have the clip value set to 100, so if you have 100 parameters, each gradient element can still have magnitude up to 100, and abs(gradient).sum() can be as large as 10,000 (100 * 100).
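A quick demonstration of this (the numbers are made up, not taken from the post); it also contrasts clip_grad_value_ with clip_grad_norm_, which bounds the gradient as a whole:

```python
import torch

# One tensor with 100 gradient elements, all at 500 before clipping.
param = torch.zeros(100, requires_grad=True)
param.grad = torch.full((100,), 500.0)

# clip_grad_value_ clamps each element to [-100, 100] independently,
# so the aggregate can still be large.
torch.nn.utils.clip_grad_value_([param], clip_value=100)
print(param.grad.abs().sum())  # tensor(10000.)

# clip_grad_norm_ instead rescales so the total norm is at most 100.
param.grad = torch.full((100,), 500.0)
torch.nn.utils.clip_grad_norm_([param], max_norm=100)
print(param.grad.norm())  # ~tensor(100.)
```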