Why didn't the authors just make it overwrite the gradient? Is there any specific reason for keeping it accumulated?
Because if you use the same network (or the same weights) more than once in a forward pass, the gradients from each use should accumulate rather than overwrite one another. Also, since PyTorch's computation graph is defined by the run, accumulation is the natural default. See https://discuss.pytorch.org/t/why-do-we-need-to-set-the-gradients-manually-to-zero-in-pytorch/4903/9
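A minimal sketch (not from the thread, the layer names are made up) illustrating the point: the same `nn.Linear` is applied twice in one forward pass, so its weight receives gradient contributions from both uses during a single `backward()`, and between steps the accumulated `.grad` has to be cleared manually.

```python
import torch
import torch.nn as nn

shared = nn.Linear(4, 4, bias=False)  # hypothetical shared layer
x = torch.randn(2, 4)

# Forward pass that reuses the same weights twice.
out = shared(shared(x)).sum()
out.backward()
print(shared.weight.grad)  # sum of gradients from both applications

# Before the next backward(), the accumulated .grad must be zeroed,
# otherwise new gradients are added onto the stale values.
shared.weight.grad.zero_()  # or optimizer.zero_grad() in a training loop
```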