
spaCy library to extract noun phrase - ValueError: [E866] Expected a string or 'Doc' as input, but got: <class 'float'>

I'm currently trying to extract noun phrases from sentences. The sentences are stored in a column of an Excel file. Here is the code, in Python:

import pandas as pd
import spacy

df = pd.read_excel("xxx.xlsx")

nlp = spacy.load("en_core_web_md")
for row in range(len(df)):
    doc = nlp(df.loc[row, "Title"])
    for np in doc.noun_chunks:
        print(np.text)

But I got this error:

Traceback (most recent call last):
  File "/Users/pusinov/PycharmProjects/textsummarizer/paper_term_extraction.py", line 10, in <module>
    doc = nlp(df.loc[row, "Title"])
  File "/Users/pusinov/PycharmProjects/textsummarizer/venv/lib/python3.9/site-packages/spacy/language.py", line 1002, in __call__
    doc = self._ensure_doc(text)
  File "/Users/pusinov/PycharmProjects/textsummarizer/venv/lib/python3.9/site-packages/spacy/language.py", line 1093, in _ensure_doc
    raise ValueError(Errors.E866.format(type=type(doc_like)))
ValueError: [E866] Expected a string or 'Doc' as input, but got: <class 'float'>.

Can anyone help me improve this code? Thank you very much.

P.S. I'm still a newbie in Python.

asked Sep 19 '25 22:09 by researchcollege111



1 Answer

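I faced a similar issue and fixed it with:

df['Title'] = df['Title'].astype(str)

This fixes the error because spaCy's nlp() accepts only a string or a Doc. pandas reads empty cells as NaN, which is a float, and numeric cells as numbers, so every value has to be converted to str before it is passed to nlp(). Note that astype(str) turns NaN into the literal text "nan", so drop or fill the empty cells first if you don't want "nan" processed as a title.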
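If you would rather skip empty cells than convert them, here is a minimal sketch of that approach. The helper names clean_text_column and print_noun_chunks are my own, and the small in-memory DataFrame is a hypothetical stand-in for the Excel file from the question; the rest assumes the same "Title" column and en_core_web_md model.

```python
import pandas as pd


def clean_text_column(df, column="Title"):
    """Return the non-empty cells of `column` as a list of strings.

    Hypothetical helper: drops NaN (empty cells), converts everything
    else to str, and discards cells that are only whitespace.
    """
    values = df[column].dropna()              # remove NaN, which is a float
    values = values.astype(str).str.strip()   # numbers become text
    return [v for v in values if v]           # skip blank strings


def print_noun_chunks(texts):
    """Print the noun chunks of each text (needs spaCy and the model)."""
    import spacy  # imported here so the cleaning step runs without spaCy

    nlp = spacy.load("en_core_web_md")
    # nlp.pipe processes the texts as a stream instead of one call per row
    for doc in nlp.pipe(texts):
        for chunk in doc.noun_chunks:
            print(chunk.text)


# Stand-in for pd.read_excel("xxx.xlsx"): one real title, one empty cell,
# one numeric cell, and one cell containing only spaces.
df = pd.DataFrame(
    {"Title": ["Deep learning for text summarization", float("nan"), 2021, "  "]}
)
titles = clean_text_column(df)
print(titles)
```

With spaCy and the model installed, calling print_noun_chunks(titles) should then print the noun phrases without raising E866, since every item in titles is a str. I also renamed the loop variable from np to chunk, because np is the usual alias for NumPy.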

answered Sep 22 '25 11:09 by Prashanth R
