ES Complex Types and Their Queries

Author: Internet

1. Working with and querying the object type

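Create the index and insert a document. Note that the sample document stores its timestamp in a field named time rather than the mapped CreateTime, so time is added to the mapping dynamically and CreateTime is never populated.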

PUT /blog
{
  "mappings": {
    "properties": {
      "Content":{
        "type": "text"
      },
      "CreateTime":{
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "Author":{
        "properties": {
          "UserName":{
            "type":"keyword"
          },
          "Adress": {
            "type": "text"
          }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content":"i learn Elasticsearch",
  "time":"2020-01-01 00:00:00",
  "Author":{
    "UserName":"mark",
    "Adress":"hangzhou"
  }
}

Now find the documents whose author is mark and whose content contains Elasticsearch. The query is:

GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "Content": "Elasticsearch"
        }},{
          "match": {
            "Author.UserName": "mark"
          }
        }
      ]
    }
  }
}

The search returns:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : {
            "UserName" : "mark",
            "Adress" : "hangzhou"
          }
        }
      }
    ]
  }
}

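When Author holds a single inner object, the search behaves as expected. Section 2 shows what changes when it holds an array of objects.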

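2. Working with arrays of objects

The mapping is the same as in section 1. If the blog index already exists, delete it first with DELETE /blog, because PUT /blog fails on an existing index.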

PUT /blog
{
  "mappings": {
    "properties": {
      "Content":{
        "type": "text"
      },
      "CreateTime":{
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "Author":{
        "properties": {
          "UserName":{
            "type":"keyword"
          },
          "Adress": {
            "type": "text"
          }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content":"i learn Elasticsearch",
  "time":"2020-01-01 00:00:00",
  "Author":[
    {
      "UserName":"mark",
      "Adress":"hangzhou"
    },
    {
      "UserName":"jerry",
      "Adress":"shanghai"
    }
  ]
}

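The blog post now has two authors. Search for a post with an author named mark whose address is shanghai. No such author exists (mark lives in hangzhou, and shanghai belongs to jerry), so the query should return nothing: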

GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "Author.Adress": "shanghai"
        }},{
          "match": {
            "Author.UserName": "mark"
          }
        }
      ]
    }
  }
}

The search returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.64933956,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "1",
        "_score" : 0.64933956,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : [
            {
              "UserName" : "mark",
              "Adress" : "hangzhou"
            },
            {
              "UserName" : "jerry",
              "Adress" : "shanghai"
            }
          ]
        }
      }
    ]
  }
}

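This time a hit is returned, which is clearly wrong. The official documentation explains why: when a field is mapped as object and receives an array of objects, Elasticsearch discards the association between the properties within each object of the array. Take this object: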

{
     "UserName" : "mark",
     "Adress" : "hangzhou"
}

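Elasticsearch drops the link between UserName and Adress and treats the values independently, so the document is flattened into the following structure: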

{
  "Author.Adress" : [ "hangzhou", "shanghai" ],
  "Author.UserName" :  [ "mark", "jerry" ]
}

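With the association gone, each field can only be searched as an independent key/value list. Author.UserName contains mark and Author.Adress contains shanghai, so both must clauses are satisfied and the document is returned, even though the two values come from different authors. The fix is to map Author as the nested type.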
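
The flattening described above can be imitated in a few lines of plain Python. This is only a simplified model, not how Elasticsearch is implemented: it ignores text analysis and scoring, and the function names flatten_object_array and matches_all are made up for this illustration.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Model an `object` field holding an array: one multi-valued list per
    dotted leaf path, with no record of which values shared an object."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


def matches_all(flat_doc, conditions):
    """Model a bool/must over flattened fields: each condition only needs
    its value to appear somewhere in that field's list."""
    return all(value in flat_doc.get(path, []) for path, value in conditions.items())


authors = [
    {"UserName": "mark", "Adress": "hangzhou"},
    {"UserName": "jerry", "Adress": "shanghai"},
]
flat = flatten_object_array("Author", authors)
print(flat)

# mark and shanghai belong to different authors, yet both conditions hold
print(matches_all(flat, {"Author.UserName": "mark", "Author.Adress": "shanghai"}))
```

The second print shows True: once the values are stored per field, nothing remembers that shanghai belonged to jerry rather than mark.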

3. The nested type

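3.1 Fixing the object-type problem

An existing object field cannot be changed to nested in place, so delete the blog index first (DELETE /blog) and recreate it with Author declared as nested.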

PUT /blog
{
  "mappings": {
    "properties": {
      "Content":{
        "type": "text"
      },
      "CreateTime":{
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "Author":{
        "type": "nested", 
        "properties": {
          "UserName":{
            "type":"keyword"
          },
          "Adress": {
            "type": "text"
          }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content":"i learn Elasticsearch",
  "time":"2020-01-01 00:00:00",
  "Author":[
    {
      "UserName":"mark",
      "Adress":"hangzhou"
    },
    {
      "UserName":"jerry",
      "Adress":"shanghai"
    }
  ]
}

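Because of the problem described in section 2, Author is now mapped as nested. Search again for an author named mark whose address is shanghai: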

GET blog/_search
{
  "query": {
    "nested": {
      "path": "Author",
      "query": {
        "bool": {
          "must": [
            {"match": {
              "Author.Adress": "shanghai"
            }},{
              "match": {
                "Author.UserName": "mark"
              }
            }
          ]
        }
      }
    }
  }
}

The result is:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

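No hits are returned, which is the correct result. Internally, each object in a nested field is indexed as its own hidden Lucene document next to the root document (so this post with two authors occupies three Lucene documents, not two), and the nested query joins them back to the root document at query time. Because the nested documents are hidden, a plain match on Author.UserName or Author.Adress placed outside a nested query matches nothing, whatever the data; conditions on Author must be wrapped in a nested query with path set to Author.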
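
As a rough illustration of the per-object matching that nested provides, the sketch below checks all conditions against one author at a time. It is a simplified plain-Python model (no analysis, no scoring), and nested_match is a hypothetical helper name, not an Elasticsearch API.

```python
def nested_match(nested_objects, conditions):
    """Model a `nested` query: every condition must hold within one and
    the same nested object for the parent document to match."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in nested_objects
    )


authors = [
    {"UserName": "mark", "Adress": "hangzhou"},
    {"UserName": "jerry", "Adress": "shanghai"},
]

# Conditions split across two different authors: no match
cross = nested_match(authors, {"UserName": "mark", "Adress": "shanghai"})
# Conditions satisfied by a single author: match
same = nested_match(authors, {"UserName": "jerry", "Adress": "shanghai"})
print(cross, same)
```

Here cross is False and same is True, which mirrors the difference between the object mapping in section 2 and the nested mapping.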

3.2 Using nested for join-style conditions, as in a relational database

Find the posts whose content contains Elasticsearch and that have an author named mark:

GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "Content": "Elasticsearch"
          }
        },
        {
          "nested": {
            "path": "Author",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "Author.UserName": "mark"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

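The result is shown below. Note that the hit listed is document 2, whose authors are scott and sam; that document is not indexed anywhere in this article and has no author named mark, so this output appears to have been captured from a different query. With only the document indexed above, the hit would be document 1.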

{
  "took" : 222,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3862944,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "2",
        "_score" : 1.3862944,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : [
            {
              "UserName" : "scott",
              "Adress" : "newyork"
            },
            {
              "UserName" : "sam",
              "Adress" : "english"
            }
          ]
        }
      }
    ]
  }
}
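
When the request body from 3.2 has to be assembled in application code, it can help to build the nested clause with a small helper. The sketch below only produces the JSON body and sends nothing to a cluster; nested_must and blog_search_body are hypothetical names chosen for this example and are not part of any Elasticsearch client.

```python
import json


def nested_must(path, conditions):
    """Build a nested query whose match clauses are all evaluated
    against the same nested object under `path`."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {f"{path}.{field}": value}}
                        for field, value in conditions.items()
                    ]
                }
            },
        }
    }


def blog_search_body(content, author_conditions):
    """Combine a top-level match on Content with a nested clause on Author."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"Content": content}},
                    nested_must("Author", author_conditions),
                ]
            }
        }
    }


body = blog_search_body("Elasticsearch", {"UserName": "mark"})
print(json.dumps(body, indent=2))
```

The printed body has the same structure as the query in 3.2; passing several author conditions puts them all inside the one nested clause, so they are checked against the same author.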

Source: https://www.cnblogs.com/GreenLeaves/p/16592472.html