I'm trying to learn how to properly add synonym support to my existing Elasticsearch setup. Here's what I understand so far about the process. I'd appreciate it if you could point out any misunderstandings, as I'm very new to Elasticsearch.
From this page I've learned that I need to add a synonym analyser and a synonym filter with a path to my synonyms file to my index config so that it looks like this:
{
"index" : {
"analysis" : {
"analyzer" : {
"synonym" : {
"tokenizer" : "whitespace",
"filter" : ["synonym"]
}
},
"filter" : {
"synonym" : {
"type" : "synonym",
"format" : "wordnet",
"synonyms_path" : "analysis/wordnet_synonyms.txt"
}
}
}
}
}
From this page I've learned how to add an analyser:
curl -XPOST 'localhost:9200/myindex/_close'
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
"analysis" : {
"analyzer":{
"synonym":{
"tokenizer":"whitespace",
"filter" : ["synonym"]
}
}
}
}'
curl -XPOST 'localhost:9200/myindex/_open'
But I don't know how to add the filter. Would it be as simple as this?:
curl -XPOST 'localhost:9200/myindex/_close'
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
"analysis" : {
"filter":{
"synonym":{
"type" : "synonym",
"format" : "wordnet",
"synonyms_path" : "analysis/wordnet_synonyms.txt",
"ignore_case" : true
}
}
}
}'
curl -XPOST 'localhost:9200/myindex/_open'
I also don't know what the path analysis/wordnet_synonyms.txt is relative to. On this page it says "relative to the config location". Where is the config location? Somewhere in /etc/elasticsearch (on Ubuntu)? Thanks!
Edit: This answer gives this as a solution:
curl -XPOST 'localhost:9200/myindex/_close'
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
"analysis" : {
"analyzer" : {
"synonym" : {
"tokenizer" : "whitespace",
"filter" : ["synonym"]
}
},
"filter":{
"synonym":{
"type" : "synonym",
"format" : "wordnet",
"synonyms_path" : "analysis/wordnet_synonyms.txt",
"ignore_case" : true
}
}
}
}'
curl -XPOST 'localhost:9200/myindex/_open'
Is this possible? A commenter said that the index would need to be recreated when changing analyser settings - is this true? And I'm still not sure where to put "wordnet_synonyms.txt".
The easiest way is to delete your index and recreate it with the analyzer and synonym token filter defined up front. Note that deleting the index also removes its documents, so they have to be indexed again. I've also added a mapping type and a dummy field to show how to use the analyzer:
curl -XDELETE localhost:9200/myindex
curl -XPUT localhost:9200/myindex -d '{
"settings": {
"index" : {
"analysis" : {
"analyzer" : {
"synonym" : {
"tokenizer" : "whitespace",
"filter" : ["synonym"]
}
},
"filter" : {
"synonym" : {
"type" : "synonym",
"format" : "wordnet",
"synonyms_path" : "analysis/wordnet_synonyms.txt"
}
}
}
}
},
"mappings": {
"typename": {
"properties": {
"fieldname": {
"type": "string",
"analyzer": "synonym"
}
}
}
}
}'
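Once the index exists, one way to check that the analyzer is wired up is the _analyze API. What follows is only a rough sketch: the helper name analyze_url, the ES_HOST variable and the sample text are made up for illustration, and the query-string form shown in the comment matches older Elasticsearch releases (newer ones expect a JSON body with "analyzer" and "text" instead).

```shell
# Hypothetical helper: build the _analyze URL for a given index and analyzer.
# ES_HOST is an assumed variable; it defaults to the host used in this thread.
ES_HOST="${ES_HOST:-localhost:9200}"

analyze_url() {
  # $1 = index name, $2 = analyzer name
  printf '%s/%s/_analyze?analyzer=%s\n' "$ES_HOST" "$1" "$2"
}

# Print the URL. To run the check against a live node, pass it to curl:
#   curl -XGET "$(analyze_url myindex synonym)" -d 'some sample text'
analyze_url myindex synonym
```

If the synonym filter is working, the response should list extra tokens for any word that has entries in your synonyms file.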
The synonyms_path is resolved relative to the directory that holds your elasticsearch.yml configuration file, so create an analysis folder there and put wordnet_synonyms.txt in it. On Ubuntu, with the packaged install, that would be
/etc/elasticsearch/analysis/wordnet_synonyms.txt
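As a minimal sketch of setting that file up: the ES_CONF_DIR variable below is an assumption introduced for illustration (point it at /etc/elasticsearch on the Ubuntu package install), and the three entries are just sample lines in the WordNet prolog format, not a real synonym database. It defaults to a scratch directory so it can be tried without root.

```shell
# Assumption: ES_CONF_DIR is the directory that holds elasticsearch.yml.
# It falls back to a temporary directory so the sketch runs without root.
ES_CONF_DIR="${ES_CONF_DIR:-$(mktemp -d)}"
mkdir -p "$ES_CONF_DIR/analysis"

# Sample entries in WordNet prolog format. Lines sharing the same
# synset id (the first number) are treated as synonyms of each other.
cat > "$ES_CONF_DIR/analysis/wordnet_synonyms.txt" <<'EOF'
s(100000001,1,'abstain',v,1,0).
s(100000001,2,'refrain',v,1,0).
s(100000001,3,'desist',v,1,0).
EOF

# Show where the file ended up.
ls -l "$ES_CONF_DIR/analysis/wordnet_synonyms.txt"
```

The file has to be readable by the user Elasticsearch runs as, and in a multi-node cluster it most likely needs to be present on every node.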