I've been trying to practise what I've learned from this tutorial (https://realpython.com/sentiment-analysis-python/) using PyCharm.
And this line:
textcat.add_label("pos")
generated a warning: Cannot find reference 'add_label' in '(Doc) -> Doc | (Doc) -> Doc'
I understand that this is because "nlp.create_pipe()" generates a Doc, not a string. I didn't know what to do about the warning, so I ran the script anyway, and then got an error from this line:
textcat = nlp.create_pipe("textcat", config={"architecture": "simple_cnn"})
Error msg:
raise ConfigValidationError(
thinc.config.ConfigValidationError:
Config validation error
textcat -> architecture extra fields not permitted
{'nlp': <spacy.lang.en.English object at 0x0000015E74F625E0>, 'name': 'textcat', 'architecture': 'simple_cnn', 'model': {'@architectures': 'spacy.TextCatEnsemble.v2', 'linear_model': {'@architectures': 'spacy.TextCatBOW.v1', 'exclusive_classes': True, 'ngram_size': 1, 'no_output_layer': False}, 'tok2vec': {'@architectures': 'spacy.Tok2Vec.v2', 'embed': {'@architectures': 'spacy.MultiHashEmbed.v1', 'width': 64, 'rows': [2000, 2000, 1000, 1000, 1000, 1000], 'attrs': ['ORTH', 'LOWER', 'PREFIX', 'SUFFIX', 'SHAPE', 'ID'], 'include_static_vectors': False}, 'encode': {'@architectures': 'spacy.MaxoutWindowEncoder.v2', 'width': 64, 'window_size': 1, 'maxout_pieces': 3, 'depth': 2}}}, 'threshold': 0.5, '@factories': 'textcat'}
I'm using:
Man! Did that full spaCy upgrade really obliterate that tutorial or what...
There are a couple of things you might be able to work around. I haven't fully fixed that broken tutorial; it's on the to-do list. However, I did get around the exact issue you're having.
textcat = nlp.create_pipe("textcat", config={"architecture": "simple_cnn"})
This create_pipe behavior has been deprecated, so you can add the component to the pipeline directly with add_pipe. One thing you could do is the following:
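import spacy
from spacy.pipeline.textcat import Config, single_label_cnn_config
<more good code>
nlp = spacy.load("en_core_web_trf")
if "textcat" not in nlp.pipe_names:
    # single_label_cnn_config is a config string, so parse it before passing it to add_pipe
    nlp.add_pipe('textcat', config=Config().from_str(single_label_cnn_config), last=True)
textcat = nlp.get_pipe('textcat')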
textcat.add_label("pos")
textcat.add_label("neg")
Let me know if this makes sense and helps. I'll try to revamp the tutorial entirely from spaCy in the coming weeks.
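As for the PyCharm warning from the question: it looks like it comes from the type hints, which describe a pipeline component only as a callable from Doc to Doc, so the IDE cannot see add_label. A minimal sketch of one possible way around it, assuming PyCharm honors isinstance narrowing, is shown here. It uses spacy.blank("en") instead of a downloaded model purely to keep the example small; that choice is mine, not the tutorial's.

```python
import spacy
from spacy.pipeline import TextCategorizer
from spacy.pipeline.textcat import single_label_cnn_config
from thinc.api import Config

# Assumption: a blank English pipeline, so no model download is needed.
nlp = spacy.blank("en")

# The CNN preset is shipped as a config string, so parse it first.
config = Config().from_str(single_label_cnn_config)
nlp.add_pipe("textcat", config=config, last=True)

textcat = nlp.get_pipe("textcat")

# get_pipe is annotated as returning a generic Doc -> Doc callable.
# The isinstance check tells a static checker which class this really is.
assert isinstance(textcat, TextCategorizer)

textcat.add_label("pos")
textcat.add_label("neg")
print(textcat.labels)
```

The warning is only a static-analysis hint, so the script behaves the same with or without the isinstance check.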
This seems to have worked with spaCy 3.1.0:
import en_core_web_md # or skip, see below
from spacy.pipeline.textcat import Config, single_label_cnn_config
nlp = en_core_web_md.load() # or nlp=spacy.load("en_core_web_sm")
config = Config().from_str(single_label_cnn_config)
if "textcat" not in nlp.pipe_names:
    nlp.add_pipe('textcat', config=config, last=True)
nlp.pipe_names
# ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner', 'textcat']