I have two lists. A is a list of words, for example ["hello", "world", ...], and len(A) is 10000. List B contains all the pre-trained vectors corresponding to A, with shape [10000, 512], where 512 is the vector dimension. I want to convert the two lists into the gensim word2vec model format so that I can load the model later, e.g. model = Word2Vec.load("word2vec.model"). How should I do this?
As you only have the words and their vectors, you don't quite have enough info for a full Word2Vec model (which includes other things like the internal neural network's hidden weights, and word frequencies).
But you can create a gensim KeyedVectors object, of the kind held in a gensim Word2Vec model's .wv property. It has many of the helper methods (like most_similar()) you may be interested in using.
Let's assume your A list-of-words is in a more-helpfully named Python list called words_list, and your B list-of-vectors is in a more-helpfully named Python list called vectors_list.
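For example, assuming A and B are exactly as described in the question, the renaming is just:
words_list = A       # 10000 words, e.g. ["hello", "world", ...]
vectors_list = B     # 10000 vectors, each of dimension 512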
Try:
from gensim.models import KeyedVectors
kv = KeyedVectors(512)
kv.add(words_list, vectors_list)  # in gensim 4.x, add_vectors() is the preferred name for this method
kv.save("mywordvecs.kvmodel")
You could then later re-load these via:
kv2 = KeyedVectors.load("mywordvecs.kvmodel")
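As a quick sanity check, you could then query the re-loaded vectors with the helper methods mentioned above (here "hello" is just an assumed example word from your words_list):
print(kv2.most_similar("hello", topn=5))   # nearest neighbours by cosine similarity
print(kv2["hello"].shape)                  # the stored 512-dimensional vector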
(You could also use save_word2vec_format() and load_word2vec_format() instead of gensim's native save()/load(), if you wanted simpler plain-vectors formats that could also be loaded by other tools that use that format. But if you're staying within gensim, the plain save()/load() are just as good – and would be better if saving a more complex trained Word2Vec model, because they'd retain the extra info those objects contain.)
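For example, a rough sketch of that alternative (the filename mywordvecs.txt is just an assumption):
kv.save_word2vec_format("mywordvecs.txt", binary=False)
kv3 = KeyedVectors.load_word2vec_format("mywordvecs.txt", binary=False)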