MongoDB version: 3.4.4
Documents in the MongoDB collection were created from the XML files (not GridFS) and look like this one:
{
    ...
    "СвНаимЮЛ" : {
        "@attributes" : {
            "НаимЮЛПолн" : "ОБЩЕСТВО С ОГРАНИЧЕННОЙ ОТВЕТСТВЕННОСТЬЮ \"КОНСАЛТИНГОВАЯ КОМПАНИЯ \"ГОТЛИБ ЛИМИТИД\"",
            ...
        },
        ...
    }
    ...
}
Language is Russian. Collection has about 10,000,000 documents and a text index on the field "СвНаимЮЛ.@attributes.НаимЮЛПолн".
Search by one word is very fast:
db.records.find({
    $text: {
        $search: "ГОТЛИБ"
    }
})
But search by several words with logical AND is so slow that I can't even wait until it ends to get explain('executionStats') results.
E.g. next query is very slow. Find all documents which contain words "ГОТЛИБ" AND "ЛИМИТИД":
db.records.find({
    $text: {
        $search: "\"ГОТЛИБ\" \"ЛИМИТИД\""
    }
})
Search by phrase is also slow. E.g find all documents which contain phrase "ГОТЛИБ ЛИМИТИД":
db.records.find({
    $text: {
        $search: "\"ГОТЛИБ ЛИМИТИД\""
    }
})
getIndexes() output:
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "egrul.records"
        },
        ...
        {
                "v" : 2,
                "key" : {
                        "_fts" : "text",
                        "_ftsx" : 1
                },
                "name" : "СвНаимЮЛ.@attributes.НаимЮЛПолн_text",
                "ns" : "egrul.records",
                "default_language" : "russian",
                "weights" : {
                        "СвНаимЮЛ.@attributes.НаимЮЛПолн" : 1
                },
                "language_override" : "language",
                "textIndexVersion" : 3
        }
]
Can I somehow increase search-by-several-words (with logical AND) or search-by-phrase speed?
Just found that search by multiple words with logical OR is also slow:
db.records.find({
    $text: {
        $search: "ГОТЛИБ ЛИМИТИД"
    }
})
Looks like the problem is not with slow search-by-multiple-words, but with slow search if search term appears in many documents.
E. g. the word "МИЦУБИСИ" appears only in 24 (from 10,000,000) documents so the query
db.records.find({
    $text: {
        $search: "МИЦУБИСИ"
    }
}).count()
is very fast.
But the word "СЕРВИС" appears in 160,000 documents and the query
db.records.find({
    $text: {
        $search: "СЕРВИС"
    }
}).count()
is very slow (takes about 40 minutes).
Query
db.records.find({
    $text: {
        $search: "\"МИЦУБИСИ\" \"СЕРВИС\""
    }
}).count()
is also slow because (I suppose) MongoDB looks for terms "МИЦУБИСИ" (fast) and "СЕРВИС" (slow) and then make intersection or something.
Now I want to find a way to limit the number of results something like find 10 documents and stop because limit() doesn't work with text queries. .
Or maybe upgrade my server hardware.
Or look at the Elasticsearch.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With