 

What does offsets mean in PyTorch nn.EmbeddingBag?

Tags:

pytorch

I understand what offsets means when it contains two numbers, but what does it mean when it contains more than two numbers? For example:

import torch
import torch.nn as nn

weight = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
embedding_sum = nn.EmbeddingBag.from_pretrained(weight, mode='sum')
print(list(embedding_sum.parameters()))

input = torch.LongTensor([0, 1])
offsets = torch.LongTensor([0, 1, 2, 1])

print(embedding_sum(input, offsets))

The result is:

[Parameter containing:
tensor([[1., 2., 3.],
        [4., 5., 6.]])]
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [0., 0., 0.],
        [0., 0., 0.]])

Can anyone help me?

asked by andy

1 Answer

As shown in the source code of nn.EmbeddingBag.forward,

return F.embedding_bag(input, self.weight, offsets,
                       self.max_norm, self.norm_type,
                       self.scale_grad_by_freq, self.mode, self.sparse,
                       per_sample_weights, self.include_last_offset)

It calls the functional embedding bag, whose documentation describes the offsets parameter as follows:

offsets (LongTensor, optional) – Only used when input is 1D. offsets determines the starting index position of each bag (sequence) in input.
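
For reference, here is a minimal sketch of the equivalent functional call for the simple two-bag case; the output shown in the comments is what mode='sum' should produce for the question's weight matrix:

import torch
import torch.nn.functional as F

weight = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
input = torch.LongTensor([0, 1])
offsets = torch.LongTensor([0, 1])  # two bags: input[0:1] and input[1:2]

print(F.embedding_bag(input, weight, offsets, mode='sum'))
# tensor([[1., 2., 3.],
#         [4., 5., 6.]])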

In the EmbeddingBag docs:

If input is 1D of shape (N), it will be treated as a concatenation of multiple bags (sequences). offsets is required to be a 1D tensor containing the starting index positions of each bag in input. Therefore, for offsets of shape (B), input will be viewed as having B bags. Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.

The last statement ("Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.") explains the zero vectors in your resulting tensor: offsets = [0, 1, 2, 1] declares four bags over a two-element input. Only the first two starting positions produce non-empty bags (input[0:1] = [0] and input[1:2] = [1]); the trailing positions (2, and then 1, which does not continue the increasing order the starting positions are meant to follow) yield empty bags, so their rows are filled with zeros.
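
To see what a well-formed offsets with more than two entries means, here is a small sketch using the same pretrained weights: a flat 1D input of four indices is split into three bags, and each output row is the sum over one bag (the expected output is shown in the comments):

import torch
import torch.nn as nn

weight = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
embedding_sum = nn.EmbeddingBag.from_pretrained(weight, mode='sum')

input = torch.LongTensor([0, 1, 0, 0])
offsets = torch.LongTensor([0, 2, 3])  # bags: input[0:2], input[2:3], input[3:4]

print(embedding_sum(input, offsets))
# tensor([[5., 7., 9.],   <- rows 0 and 1 summed
#         [1., 2., 3.],   <- row 0
#         [1., 2., 3.]])  <- row 0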

answered by ndrwnaguib

