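Complex Types in ES and How to Query Them
Author: Internet
1. Working with and querying the object type
Create the index and insert a document. Note that the sample document stores its timestamp under time rather than the mapped CreateTime field, so that value is handled by dynamic mapping: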
PUT /blog
{
  "mappings": {
    "properties": {
      "Content": { "type": "text" },
      "CreateTime": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
      "Author": {
        "properties": {
          "UserName": { "type": "keyword" },
          "Adress": { "type": "text" }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content": "i learn Elasticsearch",
  "time": "2020-01-01 00:00:00",
  "Author": { "UserName": "mark", "Adress": "hangzhou" }
}
Now find the documents whose author is mark and whose content contains Elasticsearch. The query is:
GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "Content": "Elasticsearch" } },
        { "match": { "Author.UserName": "mark" } }
      ]
    }
  }
}
The search returns:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : { "UserName" : "mark", "Adress" : "hangzhou" }
        }
      }
    ]
  }
}
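When the field holds only a single object, the search works as expected. Note, however, the behavior described in section 2 below.
2. Working with arrays of objects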
Delete the index from section 1 first (PUT /blog fails if the index already exists), then recreate it with the same mapping and index a document that has two authors:
DELETE /blog

PUT /blog
{
  "mappings": {
    "properties": {
      "Content": { "type": "text" },
      "CreateTime": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
      "Author": {
        "properties": {
          "UserName": { "type": "keyword" },
          "Adress": { "type": "text" }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content": "i learn Elasticsearch",
  "time": "2020-01-01 00:00:00",
  "Author": [
    { "UserName": "mark", "Adress": "hangzhou" },
    { "UserName": "jerry", "Adress": "shanghai" }
  ]
}
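The blog post now has two authors. Search for a record whose author is named mark and whose address is shanghai. No single author satisfies both conditions, so nothing should match. The query is: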
GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "Author.Adress": "shanghai" } },
        { "match": { "Author.UserName": "mark" } }
      ]
    }
  }
}
The search returns:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 0.64933956,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "1",
        "_score" : 0.64933956,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : [
            { "UserName" : "mark", "Adress" : "hangzhou" },
            { "UserName" : "jerry", "Adress" : "shanghai" }
          ]
        }
      }
    ]
  }
}
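A hit comes back, which is clearly wrong. According to the official documentation, when a field is mapped as the object type and an array of objects is stored in it, ES drops the association between the properties inside each object of the array. In other words, for an object such as: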
{
  "UserName" : "mark",
  "Adress" : "hangzhou"
}
ES drops the association between UserName and Adress, treats the two as independent, and flattens the document into the following form:
{
  "Author.Adress" : [ "hangzhou", "shanghai" ],
  "Author.UserName" : [ "mark", "jerry" ]
}
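Once the association is lost, a search can only match each field's values independently, as flat key/value pairs. That is why the must query above finds a hit: shanghai is among the Author.Adress values and mark is among the Author.UserName values, even though they belong to different authors. The way to solve this is to map the field as the nested type.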
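The flattening described above can be imitated in a few lines of Python. This is only a simplified sketch of the idea, not how Elasticsearch is implemented: the helper names flatten_object_array and must_match are made up for illustration, and values are compared by exact equality instead of going through text analysis.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Mimic how an object-typed field loses the boundaries between array items."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


def must_match(flat_doc, conditions):
    """True when every condition finds its value somewhere in the flattened field."""
    return all(value in flat_doc.get(field, []) for field, value in conditions.items())


authors = [
    {"UserName": "mark", "Adress": "hangzhou"},
    {"UserName": "jerry", "Adress": "shanghai"},
]

flat = flatten_object_array("Author", authors)
print(flat)
# mark and shanghai come from different authors, yet both conditions are satisfied
print(must_match(flat, {"Author.UserName": "mark", "Author.Adress": "shanghai"}))  # True
```

Because each condition is checked against the whole list of values for its field, nothing ties mark to hangzhou any more, which is the false positive seen in the search result.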
3. The nested type
3.1 Solving the object-type problem
An existing object field cannot be switched to nested in place, so delete the index and recreate it with Author mapped as nested, then index the same document:
DELETE /blog

PUT /blog
{
  "mappings": {
    "properties": {
      "Content": { "type": "text" },
      "CreateTime": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
      "Author": {
        "type": "nested",
        "properties": {
          "UserName": { "type": "keyword" },
          "Adress": { "type": "text" }
        }
      }
    }
  }
}

PUT blog/_doc/1
{
  "Content": "i learn Elasticsearch",
  "time": "2020-01-01 00:00:00",
  "Author": [
    { "UserName": "mark", "Adress": "hangzhou" },
    { "UserName": "jerry", "Adress": "shanghai" }
  ]
}
Because of the problem shown in section 2, Author is mapped as the nested type here. Now run the same search as before:
GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "Author.Adress": "shanghai" } },
        { "match": { "Author.UserName": "mark" } }
      ]
    }
  }
}
The result is:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  }
}
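This time no hit is returned. With the nested type, each object in the array is indexed as its own hidden Lucene document, separate from the parent document, and the two are joined at query time. Note, however, that fields inside a nested field can only be searched through a nested query. A plain bool query like the one above cannot see the Author.* fields at all, so it would return zero hits even for mark and hangzhou. To test per-author conditions properly, wrap them in a nested query as shown in 3.2.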
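To make the per-object semantics concrete, here is a small Python sketch. It assumes exact-equality matching instead of real text analysis, and the helper names nested_match and nested_author_query are hypothetical. The first function imitates how conditions are evaluated against one author at a time; the second builds a request body that wraps both conditions in a nested query on the Author path, which is the form needed to ask "same author has this name and this address".

```python
import json


def nested_match(objects, conditions):
    """True only when one single object satisfies every condition."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in objects
    )


def nested_author_query(user_name, adress):
    """Build a search body requiring both values on the same Author object."""
    return {
        "query": {
            "nested": {
                "path": "Author",
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"Author.UserName": user_name}},
                            {"match": {"Author.Adress": adress}},
                        ]
                    }
                },
            }
        }
    }


authors = [
    {"UserName": "mark", "Adress": "hangzhou"},
    {"UserName": "jerry", "Adress": "shanghai"},
]

print(nested_match(authors, {"UserName": "mark", "Adress": "shanghai"}))  # False
print(nested_match(authors, {"UserName": "mark", "Adress": "hangzhou"}))  # True
# Body that could be sent with GET blog/_search
print(json.dumps(nested_author_query("mark", "shanghai"), indent=2))
```

Against the document indexed in 3.1, the mark and shanghai body should produce no hits, while mark and hangzhou should return document 1.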
3.2 Using nested to express join-style conditions, as in a relational database
Find the records whose content contains Elasticsearch and whose author is mark:
GET blog/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "Content": "Elasticsearch" } },
        {
          "nested": {
            "path": "Author",
            "query": {
              "bool": {
                "must": [
                  { "match": { "Author.UserName": "mark" } }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
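The result recorded in the original post is shown below. It lists a document with _id 2 whose authors are scott and sam. That document is not indexed anywhere in the steps above and has no author named mark, so the output appears to come from a different data set. With only the document from 3.1 indexed, this query matches document 1.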
{
  "took" : 222,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 1.3862944,
    "hits" : [
      {
        "_index" : "blog",
        "_id" : "2",
        "_score" : 1.3862944,
        "_source" : {
          "Content" : "i learn Elasticsearch",
          "time" : "2020-01-01 00:00:00",
          "Author" : [
            { "UserName" : "scott", "Adress" : "newyork" },
            { "UserName" : "sam", "Adress" : "english" }
          ]
        }
      }
    ]
  }
}
Source: https://www.cnblogs.com/GreenLeaves/p/16592472.html