PyTorch: why is "accumulating" the default mode of .grad?

Tags:

pytorch

Why didn't the authors just make it overwrite the gradient? Is there any specific reason for keeping it accumulated?

asked Oct 14 '25 by aerin


1 Answer

Because if you use the same network (or the same weights) twice in the forward pass, the gradients should accumulate rather than be overwritten. Also, since the PyTorch computation graph is defined by the run, it makes sense to accumulate. See https://discuss.pytorch.org/t/why-do-we-need-to-set-the-gradients-manually-to-zero-in-pytorch/4903/9
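
As a minimal sketch (not part of the original answer), the snippet below illustrates both points: `.grad` is added to on every `backward()` call until you clear it explicitly, and reusing the same parameter twice in one forward pass relies on that same summing behaviour.

```python
import torch

# A small parameter tensor we will differentiate with respect to.
w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d(sum(3 * w))/dw = [3, 3]
loss1 = (3 * w).sum()
loss1.backward()
print(w.grad)        # tensor([3., 3.])

# Second backward pass without clearing: gradients are added, not overwritten.
loss2 = (2 * w).sum()
loss2.backward()
print(w.grad)        # tensor([5., 5.])  -> 3 + 2 accumulated

# This is why training loops call optimizer.zero_grad() (or w.grad.zero_())
# before each backward pass when accumulation is not wanted.
w.grad.zero_()

# Weight sharing within a single forward pass also depends on accumulation:
# w appears twice in the graph, and both uses contribute to w.grad.
out = (4 * w).sum() + (1 * w).sum()
out.backward()
print(w.grad)        # tensor([5., 5.])  -> contributions from both uses summed
```

The same mechanism handles both cases: each path through the graph that reaches a leaf tensor simply adds its contribution into `.grad`, so the user decides when to reset it.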

answered Oct 17 '25 by Umang Gupta


