
Restrict entity types in Spacy NER

Tags:

spacy

I'm using the spaCy large model, but it incorrectly tags entities with categories that are not relevant to my domain. For example, the 'work of art' category can stop it from recognising what should have been an Org.

Is it possible to restrict NER to only return People, Locations and Organisations?

user1134477 asked Oct 18 '25 17:10


1 Answer

Short answer:

No, you cannot tell the NER component to skip specific labels, or to predict only specific ones.

What you can do is limit it in code or modify the model [see long answer].

Limiting it in code just means filtering the returned entities; it won't solve your problem with misclassifications.

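import spacy

# Entity labels to keep: people, organisations, and locations (GPE and LOC).
WANTED_LABELS = {"PERSON", "ORG", "GPE", "LOC"}

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

# Keep only the entities whose label is in the wanted set.
entities = [ent for ent in doc.ents if ent.label_ in WANTED_LABELS]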

Long answer:

You can restrict NER in spaCy, but not with a simple parameter (currently); it requires retraining the model.

Why not? Simple: NER is a supervised machine learning task. You provide text with tagged entities, it trains and then attempts to predict new instances from the parameters it learned beforehand.

If you want NER to recognize only certain entities, such as orgs, you have to train a new model using only org instances.

If you're familiar with Machine Learning concepts, you'll understand it this way: in a multiclass classification task, you cannot simply remove a class without retraining the entire model on filtered training data.
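As a rough sketch of what "filtered train data" could look like: the snippet below assumes annotations in the `(text, {"entities": [(start, end, label)]})` tuple form used in older spaCy training examples. The function name `filter_training_data` and the sample data are hypothetical, made up for illustration; adapt them to whatever format your training pipeline actually expects.

```python
# Labels to keep; these are the names used by spaCy's English models.
KEEP_LABELS = {"PERSON", "ORG", "GPE", "LOC"}


def filter_training_data(examples, keep_labels=KEEP_LABELS):
    """Return new examples containing only the wanted entity labels.

    `examples` is assumed to be a list of
    (text, {"entities": [(start_char, end_char, label), ...]}) tuples.
    The input list is not modified.
    """
    filtered = []
    for text, annotations in examples:
        kept = [
            (start, end, label)
            for start, end, label in annotations.get("entities", [])
            if label in keep_labels
        ]
        filtered.append((text, {"entities": kept}))
    return filtered


# Hypothetical sample data, for illustration only.
TRAIN_DATA = [
    (
        "Apple is looking at buying U.K. startup for $1 billion",
        {"entities": [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]},
    ),
]

filtered_data = filter_training_data(TRAIN_DATA)
```

Texts that end up with no entities are kept on purpose, since they can still serve as negative examples. The filtered data would then be used to train a new NER model, as described above.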

Check this page for more info on NER training: https://spacy.io/usage/linguistic-features/#named-entities

Tiago Duque answered Oct 21 '25 11:10