Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

For an elastic search index, how to get the documents where array field has length greater than 0?

In elastic search index, how to get the documents where array field has length greater than 0?

I tried following multiple syntaxes but didn't get any breakthrough. I got same error in all of the syntaxes.

GET http://{{host}}:{{elasticSearchPort}}/student_details/_search

Syntax 1:

{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['enrolledCourses'].values.length > 0",
            "lang": "painless"
          }
        }
      }
    }
  }
}

Error:

"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "No field found for [enrolledCourses] in mapping with types []"
}

Syntax 2:

{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['enrolledCourses'].values.size() > 0",
            "lang": "painless"
          }
        }
      }
    }
  }
}

Error:

"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "No field found for [enrolledCourses] in mapping with types []"
}

Syntax 3:

{
  "query": {
    "bool": {
      "filter" : {
        "script" : {
          "script" : "doc['enrolledCourses'].values.size() > 0"
         }
      }
    }
  }
}

Error:

"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "No field found for [enrolledCourses] in mapping with types []"
}

Syntax 4:

{
  "query": {
    "bool": {
      "filter" : {
        "script" : {
          "script" : "doc['enrolledCourses'].values.length > 0"
         }
      }
    }
  }
}

Error:

"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "No field found for [enrolledCourses] in mapping with types []"
}   

Please help me in solving this.

like image 540
prat Avatar asked Sep 12 '25 15:09

prat


1 Answers

I don't know what version of elastic you run, then all my test I'd running on latest 7.9.0 version of Elasticsearch.

I will use painless script for scripting.

I put to documents to index test:

PUT test/_doc/1
{
   "name": "Vasia",
   "enrolledCourses" : ["test1", "test2"]
}

PUT test/_doc/2
{
   "name": "Petya"
}

How you can see one document contains enrolledCourses field and second not.

In painless you don't need use values field and you can take length directly, this is according to painless documentation. Then I skip using values operator in my script:

GET test/_search
{
   "query": {
      "bool": {
        "filter": [
          {
            "script": {
              "script": {
                "source": "doc['enrolledCourses'].length > 0",
                "lang": "painless"
              }
            }
          }
        ]
      }
   }
}

After running I'd received 2 different errors:

{
          "type" : "script_exception",
          "reason" : "runtime error",
          "script_stack" : [
            "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:757)",
            "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
            "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:331)",
            "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:97)",
            "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:94)",
            "java.base/java.security.AccessController.doPrivileged(AccessController.java:312)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:94)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
            "doc['enrolledCourses'].length > 0",
            "    ^---- HERE"
          ]
}

and
{
            "type" : "illegal_argument_exception",
            "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [enrolledCourses] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}

Both of errors is pretty clear. First for document where field doesn't exists and second because Elasticsearch indexed string array field as default mapping type text.

Both of cases is very easy to fix by mapping enrolledCourses field as keyword. In first case mapping will always provide empty field and in second keyword word be allow to run fielddata property.

PUT test 
{
   "settings": {
     "number_of_replicas": 0
   },
   "mappings": {
      "properties": {
         "name": {
           "type": "keyword"
         },
         "enrolledCourses": {
            "type": "keyword"
         }
      }
   }
}

Now I will receive right answer for query:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "name" : "Vasia",
          "enrolledCourses" : [
            "test1",
            "test2"
          ]
        }
      }
    ]
  }
}
like image 137
ozlevka Avatar answered Sep 14 '25 18:09

ozlevka