
Is it possible to use two stop words dictionaries in Postgres?

I'm trying to get a list of most frequent words appearing in a column.

SELECT
  word,
  sum(nentry) AS nentry
FROM ts_stat(
  $$
    SELECT to_tsvector('simple', body)
    FROM document
  $$
)
GROUP BY word

This works pretty well, but the problem is that the documents contain words in both French and English. If I use the English dictionary for stop words, the most frequent word I get is "pour", and if I use the French one, it's "the". Those are two words I obviously want to exclude.

Is there a way to create a configuration that uses two different dictionaries for stop words?

Thomas Groutars asked Sep 20 '25 20:09


1 Answer

You should create a stop word file that is the union of the French and English stop word files, then create a dictionary based on the simple template that uses that stop word file.

Then create a text search configuration that uses this dictionary for the asciiword and word token types, and use this configuration in to_tsvector.
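The steps above could look roughly like the following. This is a sketch, not a tested setup: the names english_french and english_french_stop are made up for illustration, and it assumes you have already placed a UTF-8 file named english_french.stop (the merged contents of english.stop and french.stop) in the tsearch_data directory under PostgreSQL's share directory.

```sql
-- Assumption: $SHAREDIR/tsearch_data/english_french.stop already exists
-- and contains the union of english.stop and french.stop, one word per line.

-- Dictionary built on the "simple" template: it lowercases each token and
-- drops it if it appears in the stop word file (no stemming is applied).
CREATE TEXT SEARCH DICTIONARY english_french_stop (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = english_french
);

-- Start from the built-in "simple" configuration and remap the two
-- token types that cover ordinary words.
CREATE TEXT SEARCH CONFIGURATION english_french (COPY = pg_catalog.simple);

ALTER TEXT SEARCH CONFIGURATION english_french
    ALTER MAPPING FOR asciiword, word
    WITH english_french_stop;

-- The query from the question, using the new configuration.
SELECT
  word,
  sum(nentry) AS total_entries
FROM ts_stat(
  $$
    SELECT to_tsvector('english_french', body)
    FROM document
  $$
)
GROUP BY word
ORDER BY total_entries DESC
LIMIT 20;
```

Other token types, such as the parts of hyphenated words, keep the mapping copied from the simple configuration, so stop words can still show up through those. If that matters, the same ALTER MAPPING statement can be extended to those token types.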

Laurenz Albe answered Sep 22 '25 09:09


