 

Understanding gradient computation using backward() in PyTorch

I'm trying to understand the basics of the PyTorch autograd system:

import torch

x = torch.tensor(10., requires_grad=True)
print('tensor:', x)
x.backward()
print('gradient:', x.grad)

output:

tensor: tensor(10., requires_grad=True)
gradient: tensor(1.)

Since x is a scalar constant and no function is applied to it, I expected the gradient to be 0. Why is it 1. instead?

asked Nov 01 '25 by volperossa

1 Answer

Whenever you call value.backward(), you compute the derivative of value (in your case value == x) with respect to all of your parameters (in your case that is just x). Roughly speaking, "parameters" here means all tensors that are somehow involved in your computation and have requires_grad=True. So this means

x.grad = dx / dx = 1
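
To see the contrast, here is a minimal sketch (not part of the original answer) where a function actually is applied to x; the gradient then reflects that function's derivative instead of dx/dx:

import torch

x = torch.tensor(10., requires_grad=True)
y = 3 * x           # apply a function: y = 3x
y.backward()        # computes dy/dx
print(x.grad)       # tensor(3.), since dy/dx = 3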

To add to that: with automatic differentiation you only ever compute with concrete ("constant") values. Your functions or networks are always evaluated at a concrete point, and the gradient you get is the gradient evaluated at that same point. There is no symbolic computation taking place; all the information needed to compute the gradient is encoded in the computation graph.
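
To illustrate that point (again a sketch, not from the original answer): for f(x) = x**2 the symbolic derivative would be 2x, but autograd only ever gives you its numeric value at the point where you evaluated, here x = 10:

import torch

x = torch.tensor(10., requires_grad=True)
y = x ** 2          # f(x) = x^2, evaluated at the concrete point x = 10
y.backward()        # computes df/dx at x = 10
print(x.grad)       # tensor(20.), i.e. the value of 2*x at x = 10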

answered Nov 02 '25 by flawr


