I am using pretrained word2vec embeddings in PyTorch (following the code here). However, they do not seem to handle unseen words. Is there a good way to solve this?
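Plain word2vec stores one vector per vocabulary word, so it has nothing to return for a word it never saw. FastText builds character n-gram vectors as part of model training. When it encounters an out-of-vocabulary (OOV) word, it combines the vectors of the character n-grams in that word (summing or averaging them, depending on the implementation) to produce a vector for the word. You can find more detail here.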
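To make the n-gram idea concrete, here is a minimal toy sketch in plain Python. It is not FastText's actual implementation: the dimension, bucket count, hash function, random n-gram table, and the names `char_ngrams`, `fnv1a`, and `oov_vector` are all assumptions made for illustration. In a real model the n-gram vectors are learned during training rather than drawn at random.

```python
import random

DIM = 8          # toy embedding size (assumption)
BUCKETS = 1000   # toy number of n-gram hash buckets (assumption)

def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word` wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]

def fnv1a(text):
    """Deterministic 32-bit FNV-1a hash, standing in for the library's own hashing."""
    h = 2166136261
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
    return h

# Stand-in for trained n-gram vectors: random values with a fixed seed.
rng = random.Random(0)
ngram_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]

def oov_vector(word):
    """Average the bucket vectors of the word's character n-grams."""
    grams = char_ngrams(word)
    if not grams:                      # e.g. an empty string has no n-grams
        return [0.0] * DIM
    vec = [0.0] * DIM
    for gram in grams:
        row = ngram_table[fnv1a(gram) % BUCKETS]
        for k in range(DIM):
            vec[k] += row[k]
    return [v / len(grams) for v in vec]

print(char_ngrams("cat", 3, 3))
print(oov_vector("unseenword"))
```

Because every word, seen or unseen, can be broken into n-grams, `oov_vector` always returns something, which is the property that word2vec's per-word lookup lacks. In PyTorch, one way to use this would be to compute such vectors from a trained FastText model and copy them into the rows of the embedding matrix.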