When I run the transformer tutorial code from PyTorch (https://pytorch.org/tutorials/beginner/transformer_tutorial.html), I hit a problem in build_vocab_from_iterator.
from torchtext.datasets import WikiText2
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator
train_iter = WikiText2(split='train')
tokenizer = get_tokenizer('basic_english')
vocab = build_vocab_from_iterator(map(tokenizer, train_iter), specials=['<unk>'])
AttributeError: 'NoneType' object has no attribute 'Lock'
This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe)
I also tried other torchtext datasets, for example with the following code:
from torchtext.datasets import IMDB
train_iter = IMDB(split='train')
def tokenize(label, line):
    return line.split()
tokens = []
for label, line in train_iter:
    tokens += tokenize(label, line)
This still raises the same error. I am running all of the code in Google Colab.
I also tried different versions of PyTorch and the corresponding torchtext, but the error persisted. I would really appreciate any help. Thanks!
In my case, the code ran after simply restarting the runtime in Google Colab. Most of the time, when the code that loads a PyTorch dataset gives this kind of error, restarting the runtime fixes it.
I hope this helps.
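If restarting alone does not help, it may be worth checking whether every package the dataset pipeline relies on is importable in the current session. The sketch below is only a diagnostic under assumptions: the helper name missing_modules is hypothetical, and the guess that an optional dependency such as portalocker is involved is not confirmed anywhere in this thread. It uses only the standard library and does not import torchtext itself.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this session.

    Hypothetical helper for illustration; it is not part of torchtext.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat any lookup failure as "not available".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Assumption: these are the packages to check; adjust the list as needed.
print(missing_modules(["torch", "torchtext", "torchdata", "portalocker"]))
```

If a name shows up in the printed list, installing that package and then restarting the runtime, so that the libraries are imported fresh, would be the next thing to try.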