 

huggingface - pegasus PegasusTokenizer is None

I am trying to use tuner007/pegasus_paraphrase and followed the examples in the Pegasus documentation.

The Pegasus model was proposed in PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu on Dec 18, 2019.

Problem:

PegasusTokenizer cannot be instantiated: PegasusTokenizer.from_pretrained(model_name) returns None. Using 'google/pegasus-xsum' as the model name gives the same result.

from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'tuner007/pegasus_paraphrase'
tokenizer = PegasusTokenizer.from_pretrained(model_name)

type(tokenizer)
---
NoneType

Please suggest how to work around it.

mon asked Oct 19 '25 03:10



1 Answer

You need to install the sentencepiece library, which the tokenizer requires to work properly. To install it, run:

pip install sentencepiece

The error occurred because you imported the tokenizer before installing sentencepiece, then installed it after seeing the error without restarting the session. Make sure sentencepiece is installed before you import the tokenizer; if you installed it afterwards, restart the session (the Python interpreter or notebook kernel) and import again.
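To make the failure obvious instead of silently getting None, you can check for sentencepiece before importing the tokenizer. This is only a minimal sketch: module_installed and load_pegasus_tokenizer are hypothetical helper names, not part of transformers, and the error messages are my own wording.

```python
import importlib.util


def module_installed(name: str) -> bool:
    """Return True if the import system can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


def load_pegasus_tokenizer(model_name: str):
    """Load a Pegasus tokenizer, failing loudly if sentencepiece is missing.

    Hypothetical helper for illustration; it only wraps
    PegasusTokenizer.from_pretrained with two sanity checks.
    """
    if not module_installed("sentencepiece"):
        raise RuntimeError(
            "sentencepiece is not installed. Run `pip install sentencepiece`, "
            "restart the session, then try again."
        )

    # Import only after the check, so transformers sees sentencepiece on import.
    from transformers import PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    if tokenizer is None:
        # Matches the symptom in the question: transformers was imported
        # before sentencepiece was installed in this session.
        raise RuntimeError(
            "from_pretrained returned None. Restart the session so that "
            "transformers is re-imported with sentencepiece available."
        )
    return tokenizer
```

After a restart, calling load_pegasus_tokenizer('tuner007/pegasus_paraphrase') should return a tokenizer object rather than None. Note that the first check cannot detect a stale session by itself, which is why the second check on the return value is there.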

Arpit Rajauria answered Oct 22 '25 05:10
