A summary of basic reindex usage, covering several ways to use reindex for Elasticsearch data migration in different scenarios.
Author: Internet
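1. Simple reindex
source holds the source index and dest holds the destination index. The host under remote must be an ip:port that has been added to the reindex.remote.whitelist setting on the new (destination) cluster.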
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "index": "index2"
  }
}
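As a minimal sketch of how the request above might be sent from a script, the following Python builds the same body and wraps it in an HTTP request using only the standard library. The destination address http://localhost:9200 and the helper names are assumptions for illustration and do not come from the original post; the remote host placeholder http://ip:port is kept as-is.

```python
import json
import urllib.request


def build_remote_reindex_body(remote_host, source_index, dest_index):
    """Build the body of a remote _reindex request (same shape as the JSON above)."""
    return {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }


def build_reindex_request(dest_cluster_url, body):
    """Wrap the body in a POST to <dest_cluster_url>/_reindex without sending it."""
    return urllib.request.Request(
        url=dest_cluster_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = build_remote_reindex_body("http://ip:port", "index1", "index2")
# Hypothetical address of the new (destination) cluster.
request = build_reindex_request("http://localhost:9200", body)
# To actually start the migration: urllib.request.urlopen(request)
```

The request is sent to the new cluster, which then pulls the data from the remote host, so the whitelist has to be configured on the cluster that receives this call.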
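2. Reindex only documents missing from the destination index
Set op_type to create to migrate only documents that exist in the old cluster but not yet in the destination index. Documents that already exist in the destination raise version conflicts, which abort the request unless conflicts is set to proceed (see section 4).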
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "index": "index2",
    "op_type": "create"
  }
}
3. Set the batch size
Set size inside source. The default is 1000 documents per batch.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1",
    "size": 2000
  },
  "dest": {
    "index": "index2"
  }
}
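Reindex reads the source index in scroll batches and writes each batch with a bulk request, so the size under source controls how many documents travel per round trip. The sketch below is only a rough model of that splitting; the function name and the document counts are made up for illustration.

```python
def scroll_batches(total_docs, batch_size=1000):
    """Return the sizes of the batches needed to copy total_docs documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, rest = divmod(total_docs, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


# 4500 documents with the default batch size and with "size": 2000.
default_batches = scroll_batches(4500)       # five batches: 4 x 1000, then 500
larger_batches = scroll_batches(4500, 2000)  # three batches: 2 x 2000, then 500
```

Larger batches mean fewer round trips but bigger bulk requests, so the best value depends on document size and on the memory available on both clusters.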
4. Continue on conflicts
Set "conflicts": "proceed" together with "op_type": "create". Version conflicts are then counted instead of aborting the request.
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "index": "index2",
    "op_type": "create"
  }
}
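To make the difference between sections 2 and 4 concrete, here is a small in-memory model of op_type create with and without conflicts set to proceed. It is a simplified sketch rather than Elasticsearch's actual implementation: the function name, the tiny batch size, and the sample documents are all assumptions for illustration.

```python
def simulate_create_reindex(source_docs, dest_docs, conflicts="abort", batch_size=1000):
    """Model op_type=create: copy only ids missing from dest_docs (mutated in place).

    source_docs is a list of (doc_id, doc) pairs; dest_docs maps doc_id -> doc.
    """
    if conflicts not in ("abort", "proceed"):
        raise ValueError("conflicts must be 'abort' or 'proceed'")
    created = 0
    version_conflicts = 0
    for start in range(0, len(source_docs), batch_size):
        batch_conflicts = 0
        for doc_id, doc in source_docs[start:start + batch_size]:
            if doc_id in dest_docs:
                batch_conflicts += 1  # existing id: create fails with a conflict
            else:
                dest_docs[doc_id] = doc
                created += 1
        version_conflicts += batch_conflicts
        if batch_conflicts and conflicts == "abort":
            # Stop after the first batch that reported a conflict.
            return {"created": created, "version_conflicts": version_conflicts, "aborted": True}
    return {"created": created, "version_conflicts": version_conflicts, "aborted": False}


source = [("1", {"name": "a"}), ("2", {"name": "b"}), ("3", {"name": "c"})]

dest_abort = {"2": {"name": "old"}}
result_abort = simulate_create_reindex(source, dest_abort, conflicts="abort", batch_size=1)

dest_proceed = {"2": {"name": "old"}}
result_proceed = simulate_create_reindex(source, dest_proceed, conflicts="proceed", batch_size=1)
```

With abort, document 3 never reaches the destination because the run stops at the conflict on document 2. With proceed, document 3 is created and the existing document 2 is left untouched.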
5. Reindex only documents that match a condition
Add a query DSL clause under source to select the documents that should be reindexed.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1",
    "query": {
      "term": { "name": "zs" }
    }
  },
  "dest": {
    "index": "index2"
  }
}
6. Reindex only some fields of the source index
Use _source to list the fields to reindex.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1",
    "_source": [ "column1", "column2" ]
  },
  "dest": {
    "index": "index2"
  }
}
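7. Exclude fields you do not want to reindex
Use excludes to drop the fields that should not be reindexed. The excludes list belongs inside _source, because the rest of the source object is parsed like a search request body.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1",
    "_source": {
      "excludes": [ "column1", "column2" ]
    }
  },
  "dest": {
    "index": "index2"
  }
}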
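The effect of the field selection in sections 6 and 7 on a single document can be sketched in a few lines of Python. This is a deliberately reduced model: it handles only exact top-level field names, whereas Elasticsearch source filtering also supports wildcards and nested paths. The function name and the sample document are hypothetical.

```python
def filter_source(doc, includes=None, excludes=None):
    """Keep the fields listed in includes (all if None), then drop those in excludes."""
    kept = {k: v for k, v in doc.items() if includes is None or k in includes}
    return {k: v for k, v in kept.items() if not excludes or k not in excludes}


doc = {"column1": 1, "column2": 2, "column3": 3}

only_listed = filter_source(doc, includes=["column1", "column2"])    # section 6
without_listed = filter_source(doc, excludes=["column1", "column2"])  # section 7
```

Fields that are filtered out are simply absent from the documents written to the destination index.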
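8. Transform data with a script during reindex
This is done with Painless, a simple and secure scripting language introduced in ES 5.x, where it also became the default scripting language.
This example lowercases the value True to true in a boolean-like field. The null check guards the same field that the script rewrites. The script body is passed as source; older 5.x releases use the key inline instead.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "index": "index2"
  },
  "script": {
    "source": "if (ctx._source.column != null) { ctx._source.column = ctx._source.column.toString().toLowerCase(); }",
    "lang": "painless"
  }
}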
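For readers less familiar with Painless, the per-document transformation can be mirrored in Python. This is only an analogue for reasoning about the logic, guarding on the field being rewritten; the function name is an assumption, and the field name column is taken from the script above.

```python
def lowercase_field(source, field="column"):
    """Lowercase the string form of source[field] if the field is present and not None."""
    if source.get(field) is not None:
        source[field] = str(source[field]).lower()
    return source


converted = lowercase_field({"column": "True", "id": 7})
untouched = lowercase_field({"id": 8})
```

Note that the result is the string true, not a JSON boolean, because the value goes through a string conversion.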
9. Rename a field
This also uses a script: the name field is renamed to newName. The script body is passed as source; older 5.x releases use the key inline instead.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "index": "index2"
  },
  "script": {
    "source": "ctx._source.newName = ctx._source.remove(\"name\")",
    "lang": "painless"
  }
}
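The rename can be pictured as a pop-and-assign on the document, as in the Python analogue below. One difference is worth pointing out: the one-line script above would likely assign null to newName when a document has no name field, while this sketch leaves such documents unchanged. The function name is an assumption for illustration.

```python
def rename_field(source, old="name", new="newName"):
    """Move source[old] to source[new]; leave the document unchanged if old is absent."""
    if old in source:
        source[new] = source.pop(old)
    return source


renamed = rename_field({"name": "zs", "age": 1})
unchanged = rename_field({"age": 2})
```

If the same guard is wanted in the script, the assignment can be wrapped in a containsKey check on ctx._source.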
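10. When clients are dual-writing
Set "conflicts": "proceed" and "version_type": "external" so that a document with a lower version does not overwrite a newer one in the destination index.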
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "remote": {
      "host": "http://ip:port"
    },
    "index": "index1"
  },
  "dest": {
    "version_type": "external",
    "index": "index2"
  }
}
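The rule applied by external versioning can be modelled as follows: a source document is written only if the destination has no document with that id, or holds one with a strictly lower version. Everything else counts as a version conflict and is skipped because conflicts is set to proceed. This is a sketch under the assumption that version numbers are comparable between the two clusters; the function name and the sample data are made up.

```python
def simulate_external_reindex(source_docs, dest_docs):
    """Model version_type=external with conflicts=proceed.

    Both arguments map doc_id -> (version, doc); dest_docs is mutated in place.
    """
    written = 0
    version_conflicts = 0
    for doc_id, (version, doc) in source_docs.items():
        existing = dest_docs.get(doc_id)
        if existing is None or version > existing[0]:
            dest_docs[doc_id] = (version, doc)  # missing or older in dest: write
            written += 1
        else:
            version_conflicts += 1  # dest is as new or newer: keep it
    return {"written": written, "version_conflicts": version_conflicts}


source = {
    "1": (3, {"name": "old"}),
    "2": (5, {"name": "new"}),
    "3": (1, {"name": "only-in-source"}),
}
dest = {
    "1": (4, {"name": "dual-written"}),
    "2": (2, {"name": "stale"}),
}
result = simulate_external_reindex(source, dest)
```

Document 1 keeps the dual-written copy, document 2 is replaced by the newer source version, and document 3 is created.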
Source: https://blog.csdn.net/weixin_50665144/article/details/115392151