
Weighted summation of embeddings in pytorch

Tags:

pytorch

I have a sequence of 12 words which I represent using a 12x256 matrix (using word embeddings). Let us refer to these as e_1, ..., e_12. I wish to take this as input and output a 1x256 vector. However, I don't want to use a (12x256) x 256 dense layer. Instead I want to create the output embedding as a weighted summation of the 12 embeddings:

out = w_1 * e_1 + w_2 * e_2 + ... + w_12 * e_12

where the w_i's are scalars (thus there is weight sharing).

How can I create trainable w_i's in PyTorch? I am new and only familiar with the standard modules like nn.Linear.
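For concreteness, here is a minimal sketch of the operation I want, with fixed (non-trainable) weights just to illustrate the shapes; the variable names are only placeholders:

import torch

embeddings = torch.randn(12, 256)   # e_1 .. e_12 as rows
weights = torch.randn(12)           # the scalars w_1 .. w_12 (fixed here, but should be trainable)

out = weights @ embeddings          # (12,) @ (12, 256) -> (256,)
print(out.shape)                    # torch.Size([256])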

asked by elexhobby


2 Answers

You can implement this via a 1D convolution with kernel_size=1:

import torch

batch_size = 2

# (batch, 12 embeddings, 256 dims)
inputs = torch.randn(batch_size, 12, 256)
# one learnable scalar per input embedding (channel)
aggregation_layer = torch.nn.Conv1d(in_channels=12, out_channels=1, kernel_size=1)
# output shape: (batch, 1, 256)
weighted_sum = aggregation_layer(inputs)

Such a convolution has 12 weight parameters (plus one bias, which you can disable with bias=False), one per input channel. Each weight plays the role of a w_i in the formula you provided, applied to the embedding e_i.

In other words, at each of the 256 positions the convolution sums the 12 channels with the same learnable weights.
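A quick check that this is indeed the same weighted sum; a minimal sketch with bias=False so the layer has exactly the 12 scalars and nothing else:

import torch

conv = torch.nn.Conv1d(in_channels=12, out_channels=1, kernel_size=1, bias=False)
x = torch.randn(2, 12, 256)                       # (batch, 12 embeddings, 256 dims)

out_conv = conv(x)                                # (2, 1, 256)
w = conv.weight.view(12)                          # the 12 learnable scalars w_i
out_manual = (w.view(1, 12, 1) * x).sum(dim=1, keepdim=True)

print(torch.allclose(out_conv, out_manual, atol=1e-6))  # True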

answered by antoleb


This should do the trick for weighted avg:

from torch import nn
import torch


class LinearWeightedAvg(nn.Module):
    def __init__(self, n_inputs):
        super(LinearWeightedAvg, self).__init__()
        # one trainable scalar weight per input embedding
        self.weights = nn.ParameterList([nn.Parameter(torch.randn(1)) for i in range(n_inputs)])

    def forward(self, input):
        # input: (n_inputs, emb_dim); accumulate the weighted embeddings
        res = 0
        for emb_idx, emb in enumerate(input):
            res += emb * self.weights[emb_idx]
        return res


example_data = torch.rand(12, 256)
wa_layer = LinearWeightedAvg(12)
res = wa_layer(example_data)
print(res.shape)  # torch.Size([256])

Answer inspired by a previous answer I received in the pytorch forums:
https://discuss.pytorch.org/t/dense-layer-with-different-inputs-for-each-neuron/47348
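A vectorized variant of the same idea (my own sketch, not from the linked thread; the class name is just illustrative): store the scalars in a single nn.Parameter and let a matrix-vector product do the summation, avoiding the Python loop.

from torch import nn
import torch


class VectorizedWeightedAvg(nn.Module):
    def __init__(self, n_inputs):
        super().__init__()
        # one trainable scalar per input embedding, stored as a single vector
        self.weights = nn.Parameter(torch.randn(n_inputs))

    def forward(self, input):
        # input: (n_inputs, emb_dim) -> (emb_dim,)
        return self.weights @ input


example_data = torch.rand(12, 256)
print(VectorizedWeightedAvg(12)(example_data).shape)  # torch.Size([256])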

answered by erap129