

How to get aggregated data from nested schema using Elasticsearch DSL

Posted at 2018-07-31

Environments

$ curl http://localhost:9200
{
  "name" : "xxxxx",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "xxxxx",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "xxxxx",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
$ python -m pip freeze
elasticsearch==6.3.0
elasticsearch-dsl==6.1.0

Introduction

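A short note on running a terms aggregation over a keyword field that sits inside a "nested" field, using Elasticsearch DSL from Python.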

Schema

# show schema
$ curl -H "Accept: application/json" -H "Content-type: application/json" -XGET "http://localhost:9200/index_name/_all/_mapping?pretty"
{
  "index_name" : {
    "mappings" : {
      "index_name" : {
        "dynamic" : "strict",
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "annotate" : {
            "properties" : {
              "data" : {
                "type" : "nested",
                "properties" : {
                  "people" : {
                    "properties" : {
                      "name" : {
                        "type" : "keyword"
                      }
                    }
                  }
                }
              }
            }
          },
          "created_at" : {
            "type" : "date"
          },
          "id" : {
            "type" : "long"
          }
        }
      }
    }
  }
}

How to get the aggregated data

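from elasticsearch_dsl import Search
from elasticsearch_dsl.connections import connections

# Register the default connection that Search uses when none is passed.
connections.create_connection(hosts=["localhost:9200"])

body = {
    "size": 0,  # return only the aggregations, no hits
    "aggs": {
        "group_by_state": {
            # Step into the nested documents stored under annotate.data.
            "nested": {
                "path": "annotate.data"
            },
            "aggs": {
                "tagging": {
                    # Count nested documents per people.name value.
                    "terms": {
                        "field": "annotate.data.people.name"
                    }
                }
            }
        }
    }
}

# Restrict the search to the index whose mapping is shown above.
s = Search.from_dict(body).index("index_name")
response = s.execute()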
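
To make the shape of the request more concrete, here is a minimal pure-Python sketch of what a nested aggregation followed by a terms aggregation counts. The sample documents and the helper name `count_people` are made up for illustration and are not taken from the real index. Elasticsearch specifics such as per-shard approximation and the default bucket limit are not modeled.

```python
from collections import Counter

# Hypothetical documents shaped like the mapping above:
# annotate.data is an array of objects, each holding one people.name.
docs = [
    {"id": 1, "annotate": {"data": [{"people": {"name": "Yamada"}},
                                    {"people": {"name": "Bob"}}]}},
    {"id": 2, "annotate": {"data": [{"people": {"name": "Yamada"}}]}},
]


def count_people(documents):
    """Return (number of nested entries, Counter of people.name values)."""
    # "nested" step: every element of annotate.data is counted on its own.
    nested_docs = [
        entry
        for doc in documents
        for entry in doc.get("annotate", {}).get("data", [])
    ]
    # "terms" step: group those entries by people.name.
    names = Counter(
        entry["people"]["name"]
        for entry in nested_docs
        if "name" in entry.get("people", {})
    )
    return len(nested_docs), names


nested_count, names = count_people(docs)
print(nested_count)  # 3
print(dict(names))   # {'Yamada': 2, 'Bob': 1}
```

The first number plays the role of `group_by_state.doc_count`, and the counter plays the role of the `tagging` buckets.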

Result

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_state" : {
      "doc_count" : 15,
      "tagging" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "John",
            "doc_count" : 2
          },
          {
            "key" : "Yamada",
            "doc_count" : 9
          },
          {
            "key" : "Lisa",
            "doc_count" : 1
          },
          {
            "key" : "Bob",
            "doc_count" : 3
          }
        ]
      }
    }
  }
}
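
The buckets sit two levels down, under the names chosen in the request (`group_by_state`, then `tagging`). Below is a minimal sketch of pulling them out of the raw JSON. It works on a plain dict so that it runs without a cluster. With Elasticsearch DSL, `response.to_dict()` should give an equivalent dict. The helper name `bucket_counts` is hypothetical.

```python
# Trimmed copy of the result above (only the aggregations part).
result = {
    "aggregations": {
        "group_by_state": {
            "doc_count": 15,
            "tagging": {
                "buckets": [
                    {"key": "John", "doc_count": 2},
                    {"key": "Yamada", "doc_count": 9},
                    {"key": "Lisa", "doc_count": 1},
                    {"key": "Bob", "doc_count": 3},
                ]
            },
        }
    }
}


def bucket_counts(raw, outer="group_by_state", inner="tagging"):
    """Map each bucket key to its doc_count."""
    buckets = raw["aggregations"][outer][inner]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


counts = bucket_counts(result)
print(counts["Yamada"])      # 9
print(sum(counts.values()))  # 15, equal to group_by_state.doc_count here
```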

By the way...

I ran into trouble earlier because of a mistake in my schema, specifically in how I used "nested".
The official documentation says:

Complex datatypes

Array datatype

  Array support does not require a dedicated type

Object datatype

  object for single JSON objects

Nested datatype

  nested for arrays of JSON objects

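I had used a schema like the one below, where "annotate" and "people" were also declared as "nested" (marked "# oops"). In the working schema only "annotate.data" is nested.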


{
  "index_name" : {
    "mappings" : {
      "index_name" : {
        "dynamic" : "strict",
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "annotate" : {
            "type": "nested", # oops
            "properties" : {
              "data" : {
                "type" : "nested",
                "properties" : {
                  "people" : {
                    "type": "nested", # oops
                    "properties" : {
                      "name" : {
                        "type" : "keyword"
                      }
                    }
                  }
                }
              }
            }
          },
          "created_at" : {
            "type" : "date"
          },
          "id" : {
            "type" : "long"
          }
        }
      }
    }
  }
}
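
One way to spot this kind of mistake is to list every field that a mapping declares as "nested" and compare the list with the `path` used in the aggregation. The following is a rough sketch, not part of Elasticsearch DSL: `nested_paths` is a hypothetical helper, and the two dicts are abbreviated copies of the "properties" sections of the two mappings in this article (only the `annotate` branch).

```python
def nested_paths(properties, prefix=""):
    """Return the dotted paths of all fields whose type is "nested"."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "nested":
            paths.append(path)
        # Object and nested fields both keep sub-fields in "properties".
        paths.extend(nested_paths(spec.get("properties", {}), path + "."))
    return paths


leaf = {"name": {"type": "keyword"}}

# Working mapping: only annotate.data is nested.
good = {"annotate": {"properties": {
    "data": {"type": "nested", "properties": {
        "people": {"properties": leaf}}}}}}

# Mapping with the extra "nested" declarations.
bad = {"annotate": {"type": "nested", "properties": {
    "data": {"type": "nested", "properties": {
        "people": {"type": "nested", "properties": leaf}}}}}}

print(nested_paths(good))  # ['annotate.data']
print(nested_paths(bad))   # ['annotate', 'annotate.data', 'annotate.data.people']
```

With the second mapping, `annotate.data.people.name` lives one nested level deeper than `annotate.data`, so a query with `"path": "annotate.data"` presumably no longer points at the level that holds the names.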

Refs

Field datatypes | Elasticsearch Reference [6.3] | Elastic
Search DSL — Elasticsearch DSL 6.2.1 documentation
