I would like to extract keywords from short Dutch texts. Is there an API or a library I could use for this?
In case none is available for Dutch, any tips on how to extract keywords myself are also appreciated. I have already tried running the texts through a part-of-speech tagger and a lemmatizer, but from there I find it quite difficult to extract decent keywords. TF-IDF is not useful, since the texts are too short to give good results.
I prefer Java, but implementations in any other language are also very welcome.
Here is my video series on text mining with RapidMiner. It shows how to easily get the TF-IDF and more:
http://vancouverdata.blogspot.ca/2010/11/text-analytics-with-rapidminer-loading.html
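For anyone who would rather stay in plain Java, as the question prefers, here is a minimal sketch of what a TF-IDF computation looks like. It is not taken from the videos or from RapidMiner; the class name `TfIdfSketch` and the methods `tfIdf` and `topKeywords` are made up for illustration, and the input is assumed to be already tokenized, lowercased and lemmatized (for example by the tagger and lemmatizer mentioned in the question).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class TfIdfSketch {

    // Returns one term -> weight map per document.
    // Assumes each document is a list of already lemmatized, lowercased tokens.
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: in how many documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        List<Map<String, Double>> result = new ArrayList<>();
        for (List<String> doc : docs) {
            // Raw term counts within this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : doc) {
                counts.merge(term, 1, Integer::sum);
            }
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                // Term frequency, normalized by document length.
                double tf = (double) e.getValue() / doc.size();
                // Unsmoothed inverse document frequency: log(N / df).
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                weights.put(e.getKey(), tf * idf);
            }
            result.add(weights);
        }
        return result;
    }

    // Returns the k highest-weighted terms; ties are broken alphabetically.
    static List<String> topKeywords(Map<String, Double> weights, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(weights.entrySet());
        entries.sort((a, b) -> {
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

Calling `TfIdfSketch.tfIdf(docs)` and then `TfIdfSketch.topKeywords(weights.get(i), 5)` gives a ranked keyword list per text. In very short texts most terms occur only once, so the ranking is driven almost entirely by the IDF part, which is probably why the results look weak on a small collection.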