Cosine similarity between a vector and the rows of a matrix in PyTorch

In PyTorch, I have a large number (on the order of a hundred thousand) of 300-dimensional vectors, which I think I should load into a matrix. I want to sort them by their cosine similarity with another vector and extract the top 1000. I want to avoid a for loop because it is too slow. Is there an efficient way to do this?

user3531835 asked Oct 28 '25 09:10


1 Answer

You can use torch.nn.functional.cosine_similarity to compute the cosine similarity, and torch.argsort (in descending order) to extract the top 1000.

Here is an example:

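import torch
import torch.nn.functional as F

x = torch.rand(10000, 300)  # candidate vectors, one per row
y = torch.rand(1, 300)      # query vector, shaped (1, 300) to broadcast against x
dist = F.cosine_similarity(x, y)  # shape (10000,); dim=1 is the default
# argsort is ascending by default, so ask for the highest similarity first
index_sorted = torch.argsort(dist, descending=True)
top_1000 = index_sorted[:1000]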

Note the shape of y: if your query is a 1-D tensor of shape (300,), reshape it to (1, 300) before calling cosine_similarity so that it lines up row-wise with x. Also note that argsort returns only the indices of the vectors, ordered by similarity. To access the vectors themselves, write x[top_1000], which returns a matrix of shape (1000, 300).
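If only the top entries are needed, a full sort can be skipped. The following is a minimal sketch, not part of the original answer, that uses torch.topk to pick the k highest similarities directly. The helper name top_k_by_cosine and the 1-D query shape are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def top_k_by_cosine(matrix, query, k=1000):
    """Hypothetical helper: k rows of `matrix` most similar to the 1-D `query`."""
    # Reshape the query from (dim,) to (1, dim) so it broadcasts over the rows.
    sims = F.cosine_similarity(matrix, query.unsqueeze(0), dim=1)  # shape (n,)
    # topk returns the k largest values and their indices, highest first.
    return torch.topk(sims, k)


torch.manual_seed(0)
x = torch.rand(10000, 300)  # candidate vectors, one per row
y = torch.rand(300)         # query vector
scores, idx = top_k_by_cosine(x, y, k=1000)
top_vectors = x[idx]        # shape (1000, 300)
```

Here scores holds the similarity values and idx the matching row indices, so both come from a single call and no separate slicing step is needed.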

Shihab Shahriar Khan answered Nov 01 '25 07:11


