I am using the autograd tool in PyTorch, and have found myself in a situation where I need to access the values in a 1D tensor by means of an integer index. Something like this:
def basic_fun(x_cloned):
    res = []
    for i in range(len(x)):
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return Variable(torch.FloatTensor(res))

def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad

x = Variable(torch.FloatTensor([1, 2, 3, 4, 5]), requires_grad=True)
x_cloned = x.clone()
print(get_grad(x_cloned, x))
I am getting the following error message:
[tensor(1., grad_fn=<ThMulBackward>), tensor(4., grad_fn=<ThMulBackward>), tensor(9., grad_fn=<ThMulBackward>), tensor(16., grad_fn=<ThMulBackward>), tensor(25., grad_fn=<ThMulBackward>)]
Traceback (most recent call last):
  File "/home/mhy/projects/pytorch-optim/predict.py", line 74, in <module>
    print(get_grad(x_cloned, x))
  File "/home/mhy/projects/pytorch-optim/predict.py", line 68, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I am, in general, a bit skeptical about how using the cloned version of a variable is supposed to keep that variable in the gradient computation. The variable itself is effectively not used in the computation of A, so when you call A.backward(), it should not be part of that operation.
I would appreciate help with this approach, or a pointer to a better way to index through a 1D tensor with requires_grad=True without losing the gradient history!
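res is a list of zero-dimensional tensors containing the squares of 1 to 5. Wrapping that list in Variable(torch.FloatTensor(res)) copies the values into a new tensor that has no connection to the graph, which is why backward() reports that it does not require grad. To concatenate the list into a single tensor containing [1.0, 4.0, ..., 25.0] while keeping the history, I changed return Variable(torch.FloatTensor(res)) to return torch.stack(res, dim=0), which produces tensor([ 1., 4., 9., 16., 25.], grad_fn=<StackBackward>).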
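A minimal sketch of that difference, assuming a recent PyTorch release in which plain tensors carry autograd state (so Variable is not needed). The names squares, rebuilt, and stacked are illustrative and do not come from the question:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)

# A list of zero-dimensional tensors, each still attached to the graph of x.
squares = [x[i] * x[i] for i in range(len(x))]

# Copying the raw values into a brand-new tensor discards the graph history.
rebuilt = torch.tensor([s.item() for s in squares])

# torch.stack is itself a differentiable operation, so the history survives.
stacked = torch.stack(squares, dim=0)

print(rebuilt.requires_grad, rebuilt.grad_fn)
print(stacked.requires_grad, stacked.grad_fn is not None)
```

Only the stacked tensor can be used as the starting point of a backward pass that reaches x.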
However, I am getting this new error, caused by the A.backward() line.
Traceback (most recent call last):
  File "<project_path>/playground.py", line 22, in <module>
    print(get_grad(x_cloned, x))
  File "<project_path>/playground.py", line 16, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 84, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 28, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs
The documentation for backward() explains this error: it computes the gradient of the current tensor w.r.t. graph leaves, differentiating the graph using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying the gradient argument.
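As a hedged illustration of the two usual ways around this error, the sketch below either passes an explicit gradient tensor to backward() or reduces the output to a scalar first. It assumes a recent PyTorch release; the names grad_explicit and grad_summed are illustrative only:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)
A = x * x  # non-scalar output with five elements

# Option 1: supply the gradient of some downstream scalar w.r.t. A explicitly.
# A tensor of ones weights every element of A equally.
A.backward(gradient=torch.ones_like(A))
grad_explicit = x.grad.clone()

# Gradients accumulate across backward calls, so reset before the second run.
x.grad = None

# Option 2: reduce to a scalar, after which no gradient argument is needed.
(x * x).sum().backward()
grad_summed = x.grad.clone()

print(grad_explicit)
print(grad_summed)
```

Both options yield the same result here, 2 * x, because passing a tensor of ones is equivalent to differentiating the sum of the outputs.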
I changed my basic_fun to the following, which resolved my problem:
def basic_fun(x_cloned):
    # One-element accumulator; it joins the graph on the first in-place add.
    res = torch.FloatTensor([0])
    for i in range(len(x_cloned)):
        res += x_cloned[i] * x_cloned[i]
    return res
This version returns a one-element tensor, which backward() treats as a scalar, so the gradient can be created implicitly.
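On the original doubt about clone(): clone is itself recorded in the graph, so gradients flow through the clone back to the original leaf tensor. The following is a minimal sketch under that assumption, written for a recent PyTorch release without Variable; sum_of_squares is a hypothetical helper name, not the asker's function:

```python
import torch

def sum_of_squares(values):
    # Start from a Python float so no in-place tensor update is needed.
    total = 0.0
    for i in range(len(values)):
        total = total + values[i] * values[i]
    return total  # zero-dimensional tensor attached to the graph

x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)
x_cloned = x.clone()  # differentiable copy; x stays the leaf of the graph

loss = sum_of_squares(x_cloned)
loss.backward()

# d/dx_i of sum(x_i^2) is 2 * x_i, and it lands on the original leaf x.
print(x.grad)
```

If the index loop is not essential, (x_cloned * x_cloned).sum() computes the same scalar without indexing element by element.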