Visualize the cosine similarity scores calculated using pretrained word embeddings in spaCy

Question

I have used spaCy's pretrained model 'en_core_web_lg' to compute the cosine similarity between a group of values and a group of attributes. I want to visualize how close each word is to the others, much like a clustering plot.

Here is the link to the table which contains the similarity score for each value vs. each attribute.

The columns are the attributes for which I am trying to find the similarity score, while the rows are the values that I am trying to classify under their most likely attribute.

This is the output I am trying to achieve. Please take a look at it.

Stefano Fiorucci - anakin87 · Accepted Answer

If you want a plot similar to this t-SNE plot, you need to reduce your word vectors to 2 dimensions.

So, you have to apply a dimensionality reduction algorithm, such as t-SNE (which is also implemented in scikit-learn), to the word vectors you want to plot.

Similarity scores are not sufficient to do this; you need whole vectors.
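To illustrate why, here is a minimal sketch with made-up 3-dimensional toy vectors (not real spaCy embeddings; the helper name `cosine_similarity` is just for this example). Each pair of vectors collapses into a single number, so the score alone no longer tells you where each word sits in the embedding space.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: one scalar per pair."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors, invented purely for illustration.
value = np.array([1.0, 2.0, 0.0])
attribute = np.array([2.0, 4.0, 0.0])        # same direction as `value`
other_attribute = np.array([0.0, 0.0, 3.0])  # orthogonal to `value`

score_parallel = cosine_similarity(value, attribute)
score_orthogonal = cosine_similarity(value, other_attribute)

print(score_parallel, score_orthogonal)
```

Many different pairs of vectors produce the same score, which is why a dimensionality reduction algorithm needs the vectors themselves as input.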

There is a nice Kaggle tutorial about using t-SNE to visualize word vectors. You can adapt it by selecting only the words you are interested in.
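As a rough sketch of the approach (not the code from the tutorial), something like the following should work, assuming scikit-learn, matplotlib, and the 'en_core_web_lg' model are installed. The function names `reduce_to_2d` and `plot_words` and the perplexity heuristic are my own choices for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE


def reduce_to_2d(vectors, random_state=0):
    """Project an (n_words, dim) array of word vectors to (n_words, 2) with t-SNE."""
    vectors = np.asarray(vectors, dtype=np.float32)
    n_words = vectors.shape[0]
    # scikit-learn requires perplexity < number of samples, so keep it small
    # for short word lists (this heuristic is an assumption, tune it as needed).
    perplexity = min(30.0, max(1.0, (n_words - 1) / 3))
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=random_state)
    return tsne.fit_transform(vectors)


def plot_words(words, model_name="en_core_web_lg"):
    """Look up spaCy vectors for `words` and scatter-plot their 2-D projection."""
    # Imported here so reduce_to_2d can be used without spaCy or matplotlib.
    import matplotlib.pyplot as plt
    import spacy

    nlp = spacy.load(model_name)
    # nlp.vocab[word].vector is the full pretrained vector, not a similarity score.
    vectors = np.array([nlp.vocab[word].vector for word in words])
    coords = reduce_to_2d(vectors)

    fig, ax = plt.subplots()
    ax.scatter(coords[:, 0], coords[:, 1])
    for word, (x, y) in zip(words, coords):
        ax.annotate(word, (x, y))  # label each point with its word
    plt.show()
    return coords
```

You would call `plot_words` with the list of your values and attributes together, so that both end up in the same 2-D space. Keep in mind that t-SNE is stochastic and mainly preserves local neighbourhoods, so distances between far-apart groups in the plot should not be over-interpreted.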

Tags:

python-3.x

nlp

word-embedding

spacy

Arpit Sah
