I want to do text analysis on French texts to visualise the similarities between them and the possible classes, depending on the words that are used. I am asking for help because I have just started working with Python, and I would like to know the best way to do text analysis in Python given that my texts are in French.
Are there libraries specially designed for French texts? I would use them to clean the data and then to analyse it.
I can already :
What I can't do with French words: convert plurals to the singular, convert verbs to their infinitive form...
The spaCy library and the TreeTagger tool (which you can use through the treetaggerwrapper library) both have good French support.
Example using spaCy:
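import spacy

# Load the small French pipeline (install it first: python -m spacy download fr_core_news_sm)
nlp_fr = spacy.load('fr_core_news_sm')
text = "J'ai mangé des pommes hier"
tokens = nlp_fr(text)
for token in tokens:
    # lemma_ is the base form: singular for nouns, infinitive for verbs
    print(token.lemma_)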
Prints:
je
avoir
manger
un
pomme
hier
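For the cleaning step mentioned in the question, the same token objects can be filtered before their lemmas are collected. The following is a minimal sketch, not a definitive recipe: `content_lemmas` and `lemmatize_fr` are hypothetical helper names, and it assumes the `fr_core_news_sm` model is installed. The filter relies only on the spaCy token attributes `lemma_`, `is_stop`, `is_punct` and `is_space`.

```python
def content_lemmas(tokens):
    """Return lowercased lemmas, skipping stop words, punctuation and whitespace.

    `tokens` is any iterable of spaCy-like tokens (hypothetical helper).
    """
    return [
        tok.lemma_.lower()
        for tok in tokens
        if not (tok.is_stop or tok.is_punct or tok.is_space)
    ]


def lemmatize_fr(text, model="fr_core_news_sm"):
    """Run a French spaCy pipeline on `text` and keep only content lemmas.

    Assumes the model named in `model` has already been downloaded.
    """
    import spacy  # imported lazily so content_lemmas works without spaCy

    nlp_fr = spacy.load(model)
    return content_lemmas(nlp_fr(text))
```

Calling `lemmatize_fr("J'ai mangé des pommes hier")` should keep mostly content words such as `manger` and `pomme`, but the exact result depends on the model version and its stop-word list.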
TreeTagger is more difficult to install, but this can help you, and here is the documentation of the Python wrapper.
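To get from lemmas to the similarity comparison asked about in the question, one simple option is a bag-of-words cosine similarity. This is only a sketch using the standard library: `cosine_similarity` is a hypothetical helper, and the lemma lists are hand-written for illustration rather than real spaCy or TreeTagger output.

```python
import math
from collections import Counter


def cosine_similarity(lemmas_a, lemmas_b):
    """Cosine similarity between two texts given as lists of lemmas."""
    counts_a, counts_b = Counter(lemmas_a), Counter(lemmas_b)
    # Dot product over the lemmas that both texts share.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[lemma] * counts_b[lemma] for lemma in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hand-written lemma lists for illustration (not actual lemmatizer output).
text_1 = ["manger", "pomme", "hier"]
text_2 = ["manger", "poire", "hier"]
text_3 = ["voiture", "rouler", "vite"]

print(cosine_similarity(text_1, text_2))  # two of three lemmas shared
print(cosine_similarity(text_1, text_3))  # no lemma shared
```

The pairwise scores for a whole collection can be stored in a matrix and plotted, for example as a heatmap, to visualise which texts are close to each other.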