 

Understanding gradient computation using backward() in PyTorch

I'm trying to understand the basics of the PyTorch autograd system:

import torch

x = torch.tensor(10., requires_grad=True)
print('tensor:', x)
x.backward()
print('gradient:', x.grad)

output:

tensor: tensor(10., requires_grad=True)
gradient: tensor(1.)

Since x is a scalar constant and no function is applied to it, I expected the gradient to be 0. Why is it 1. instead?

asked Nov 01 '25 by volperossa

1 Answer

Whenever you call value.backward(), you compute the derivative of value (in your case value == x) with respect to all of your parameters (in your case that is just x). Roughly speaking, "parameters" here means all tensors that are somehow involved in your computation and have requires_grad=True. So this means

x.grad = dx / dx = 1
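
To see the contrast, here is a minimal sketch (not part of the original answer) where a function actually is applied to x; the gradient then reflects that function's derivative instead of dx/dx:

import torch

x = torch.tensor(10., requires_grad=True)
y = 3 * x           # apply a function: y = 3x
y.backward()        # computes dy/dx
print(x.grad)       # tensor(3.), since dy/dx = 3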

To add to that: with automatic differentiation you only ever compute with concrete ("constant") values. Your functions or networks are always evaluated at a concrete point, and the gradient you get is the gradient evaluated at that same point. There is no symbolic computation taking place; all the information needed to compute the gradient is encoded in the computation graph.
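
To illustrate that point (again a sketch, not from the original answer): for f(x) = x**2 the symbolic derivative would be 2x, but autograd only ever gives you its numeric value at the point where you evaluated, here x = 10:

import torch

x = torch.tensor(10., requires_grad=True)
y = x ** 2          # f(x) = x^2, evaluated at the concrete point x = 10
y.backward()        # computes df/dx at x = 10
print(x.grad)       # tensor(20.), i.e. the value of 2*x at x = 10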

answered Nov 02 '25 by flawr


