I understand what offsets means when it contains two numbers, but what does it mean when it contains more than two? For example:
import torch
import torch.nn as nn

weight = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
embedding_sum = nn.EmbeddingBag.from_pretrained(weight, mode='sum')
print(list(embedding_sum.parameters()))
input = torch.LongTensor([0, 1])
offsets = torch.LongTensor([0, 1, 2, 1])
print(embedding_sum(input, offsets))
The result is:
[Parameter containing:
tensor([[1., 2., 3.],
        [4., 5., 6.]])]
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [0., 0., 0.],
        [0., 0., 0.]])
Can anyone help me understand this?
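As shown in the source code, EmbeddingBag.forward ends with a call like this (the exact argument list varies between PyTorch versions):
return F.embedding_bag(
    input, self.weight, offsets, self.max_norm, self.norm_type,
    self.scale_grad_by_freq, self.mode, self.sparse,
    per_sample_weights, self.include_last_offset)
So it uses the functional embedding_bag, whose documentation describes the offsets parameter as: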
offsets (LongTensor, optional) – Only used when input is 1D. offsets determines the starting index position of each bag (sequence) in input.
In the EmbeddingBag docs:
If input is 1D of shape (N), it will be treated as a concatenation of multiple bags (sequences). offsets is required to be a 1D tensor containing the starting index positions of each bag in input. Therefore, for offsets of shape (B), input will be viewed as having B bags. Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.
The last statement ("Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.") explains the zero vectors in your resulting tensor.
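To make the offsets semantics concrete, here is a minimal pure-Python sketch of what sum mode does with a 1D input. It is not PyTorch's implementation: embedding_bag_sum is a hypothetical helper written for illustration, and it assumes the offsets are non-decreasing, so it does not try to model out-of-order offsets such as the [0, 1, 2, 1] in the question.

```python
def embedding_bag_sum(weight, indices, offsets):
    """Hypothetical stand-in for EmbeddingBag(mode='sum') on a 1D input.

    Assumes offsets are non-decreasing and start at 0.
    """
    dim = len(weight[0])
    bags = []
    for b, start in enumerate(offsets):
        # A bag runs from its own offset up to the next offset,
        # or to the end of indices for the last bag.
        end = offsets[b + 1] if b + 1 < len(offsets) else len(indices)
        total = [0.0] * dim  # an empty bag keeps this all-zero vector
        for idx in indices[start:end]:
            total = [t + w for t, w in zip(total, weight[idx])]
        bags.append(total)
    return bags


weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
indices = [0, 1]

one_bag = embedding_bag_sum(weight, indices, [0])               # bag: [0, 1]
two_bags = embedding_bag_sum(weight, indices, [0, 1])           # bags: [0], [1]
with_empty_bag = embedding_bag_sum(weight, indices, [0, 1, 2])  # bags: [0], [1], []

print(one_bag)
print(two_bags)
print(with_empty_bag)
```

In this sketch, offsets of [0, 1, 2] give three bags. The third bag starts at the end of the input, so it is empty and comes back as a zero vector, which is the situation the documentation describes.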