I have saved a Gensim dictionary to disk. When I load it, the id2token attribute dict is not populated.
Here is the code that saves the dictionary:
from gensim import corpora
dictionary = corpora.Dictionary(tag_docs)
dictionary.save("tag_dictionary_lda.pkl")
When I load it (in a Jupyter notebook), mapping tokens to IDs still works fine, but id2token does not work: I cannot map IDs back to tokens, and in fact id2token is not populated at all.
> dictionary = corpora.Dictionary.load("../data/tag_dictionary_lda.pkl")
> dictionary.token2id["love"]
Out: 1613
> dictionary.doc2bow(["love"])
Out: [(1613, 1)]
> dictionary.id2token[1613]
Out:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 dictionary.id2token[1613]
KeyError: 1613
> list(dictionary.id2token.keys())
Out: []
Any thoughts?
You don't need dictionary.id2token[1613]; you can use dictionary[1613] directly.
Note that if you check dictionary.id2token afterwards, it won't be empty any more. That's because dictionary.id2token is built only on request, to save memory (as noted in the Dictionary class's __init__).
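To make the "built only on request" behaviour concrete, here is a minimal sketch of the pattern in plain Python. The LazyDictionary class is a hypothetical stand-in written for illustration, not Gensim's actual source; it only assumes that the reverse mapping starts empty and is rebuilt inside the [] lookup, as described above.

```python
class LazyDictionary:
    """Hypothetical stand-in for illustration; not Gensim's actual class."""

    def __init__(self, documents):
        self.token2id = {}
        self.id2token = {}  # reverse mapping, left empty until first lookup
        for doc in documents:
            for token in doc:
                # give each unseen token the next free integer id
                self.token2id.setdefault(token, len(self.token2id))

    def __getitem__(self, token_id):
        # rebuild the reverse mapping only when it is out of date
        if len(self.id2token) != len(self.token2id):
            self.id2token = {i: t for t, i in self.token2id.items()}
        return self.id2token[token_id]


d = LazyDictionary([["love", "hate"], ["love", "peace"]])
love_id = d.token2id["love"]
before = dict(d.id2token)  # snapshot: nothing has asked for id -> token yet
token = d[love_id]         # the first [] lookup builds the reverse mapping
after = dict(d.id2token)   # snapshot: now populated
print(before, token, after)
```

Accessing id2token directly skips the rebuild step, which is why it raises KeyError on a freshly loaded dictionary while the [] lookup succeeds.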