Configuring Full, Delta, and Scheduled Delta Indexing in Solr
Author: Internet
1. Full Indexing
Add the following to solr_home\solr\core0\conf\solrconfig.xml:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
In the solr_home\solr\core0\conf directory, create data-config.xml with the following content:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://127.0.0.1;DatabaseName=db" user="sa" password="111111"/>
<document name="Info">
<entity name="news" transformer="ClobTransformer" pk="id"
query="select id,title from [news]"
deltaImportQuery="select id,title from [news]">
<field column="id" name="id"/>
<field column="title" name="title"/>
</entity>
</document>
</dataConfig>
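In the Solr admin UI, open the Dataimport page for the core, select the full-import command, and click Execute to run a full index.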
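The same full import can also be requested over HTTP instead of through the admin UI. The following is a minimal sketch that only builds the request URL. It assumes the host, port, and webapp name used in the scheduler settings of this article (localhost, 8080, solr) and the usual http://host:port/webapp/core/handler layout; the helper name dataimport_url is hypothetical.

```python
from urllib.parse import urlencode


def dataimport_url(core, handler="/dataimport", command="full-import",
                   host="localhost", port=8080, webapp="solr", **params):
    """Build the URL that asks a DIH request handler to run a command.

    host, port and webapp are assumptions taken from this article's
    scheduler settings; adjust them to your own deployment.
    """
    query = {"command": command}
    query.update(params)
    return f"http://{host}:{port}/{webapp}/{core}{handler}?{urlencode(query)}"


# clean=true empties the index before importing; commit=true commits afterwards.
full_import_url = dataimport_url("core0", clean="true", commit="true")
print(full_import_url)
```

Once Solr is running, the URL can be opened in a browser or sent with urllib.request.urlopen(full_import_url).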
2. Delta Indexing
Add the following to solr_home\solr\new_core\conf\solrconfig.xml:
<requestHandler name="/deltaimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">delta-data-config.xml</str>
</lst>
</requestHandler>
In the solr_home\solr\new_core\conf directory, create delta-data-config.xml with the following content:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://127.0.0.1;DatabaseName=db" user="sa" password="111111"/>
<document name="Info">
<entity name="zpxx1" transformer="RegexTransformer" pk="id"
query="select id,title from [news]"
deltaImportQuery="select id,title from [news] where id = '${dih.delta.id}'"
deltaQuery="SELECT id FROM [news] where [UpdateTime] > '${dih.last_index_time}'">
<field column="id" name="id"/>
<field column="title" name="title"/>
</entity>
</document>
</dataConfig>
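${dih.delta.id} and ${dih.last_index_time} are built-in DIH variables. deltaQuery selects the ids of rows changed since ${dih.last_index_time}, and deltaImportQuery then loads each of those rows by ${dih.delta.id}. The time of the last import is recorded in dataimport.properties, for example: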
#Mon Dec 12 17:16:29 CST 2016
last_index_time=2016-12-12 17:16:29
user.last_index_time=2016-12-12 17:16:29
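To run a delta index, execute the delta-import command against the /deltaimport handler, for example from the Dataimport page of the admin UI.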
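To make the two-step delta flow concrete, here is a small Python sketch of how the placeholders in deltaQuery and deltaImportQuery get filled in. It is an illustration only, not DIH's actual implementation: the helper names parse_properties and resolve are hypothetical, and the ids 7 and 9 are made up to stand in for the result of the first query.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(query, variables):
    """Replace each ${name} placeholder with its value (illustration only)."""
    for name, value in variables.items():
        query = query.replace("${" + name + "}", str(value))
    return query


PROPERTIES = """\
#Mon Dec 12 17:16:29 CST 2016
last_index_time=2016-12-12 17:16:29
user.last_index_time=2016-12-12 17:16:29
"""

DELTA_QUERY = "SELECT id FROM [news] where [UpdateTime] > '${dih.last_index_time}'"
DELTA_IMPORT_QUERY = "select id,title from [news] where id = '${dih.delta.id}'"

props = parse_properties(PROPERTIES)

# Step 1: deltaQuery finds the ids of rows changed since the last import.
step1 = resolve(DELTA_QUERY, {"dih.last_index_time": props["last_index_time"]})

# Step 2: deltaImportQuery runs once for every id returned by step 1.
changed_ids = [7, 9]  # hypothetical ids standing in for the step 1 result
step2 = [resolve(DELTA_IMPORT_QUERY, {"dih.delta.id": i}) for i in changed_ids]

print(step1)
for sql in step2:
    print(sql)
```

The sketch shows why the table needs a reliable UpdateTime column: rows whose timestamp is not updated on change are never picked up by step 1.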
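3. Scheduled Delta Indexing
Place the scheduler's dataimport.properties configuration file in the solr_home/conf directory; create the conf directory yourself if it does not exist.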
#################################################
# #
# dataimport scheduler properties #
# #
#################################################
# to sync or not to sync
# 1 - active; anything else - inactive
syncEnabled=1
# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
# change this to the core you use
syncCores=core0
# solr server name or IP address
# [defaults to localhost if empty]
server=localhost
# solr server port
# [defaults to 80 if empty]
port=8080
# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr
# URL params [mandatory]
# remainder of URL
# delta-import URL path and parameters
params=/deltaimport?command=delta-import&clean=false&commit=true
# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
# scheduling interval (minutes)
interval=1
# interval between full index rebuilds, in minutes; defaults to 7200 (5 days)
# empty, 0, or commented out: never rebuild the index
reBuildIndexInterval=0
# URL path and parameters for the index rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
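# start time from which the rebuild interval is counted; first actual run time (in milliseconds) = reBuildIndexBeginTime + reBuildIndexInterval * 60 * 1000
# two formats: 2016-12-11 14:10:00 or 03:10:00; for the latter, the date part is filled in with the date on which the service started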
reBuildIndexBeginTime=14:05:00
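The settings above can be read as follows. This is a hedged Python sketch of what the properties imply, not the scheduler's own code. It assumes that the scheduler joins server, port, webapp, core, and params into one URL per core, and that syncCores may hold a comma-separated list; the function names and the service start time are hypothetical.

```python
from datetime import datetime, timedelta


def delta_urls(props):
    """Compose one request URL per core listed in syncCores (assumed layout)."""
    server = props.get("server") or "localhost"  # documented default
    port = props.get("port") or "80"             # documented default
    cores = [c.strip() for c in props.get("syncCores", "").split(",") if c.strip()]
    # Single-core deployments (empty syncCores) are not handled in this sketch.
    return [f"http://{server}:{port}/{props['webapp']}/{core}{props['params']}"
            for core in cores]


def first_rebuild_time(begin_time, interval_minutes, service_start):
    """Return reBuildIndexBeginTime + reBuildIndexInterval, or None if disabled."""
    if not interval_minutes:      # empty or 0 means the index is never rebuilt
        return None
    if len(begin_time) <= 8:      # time-only form such as "14:05:00"
        begin_time = f"{service_start:%Y-%m-%d} {begin_time}"
    begin = datetime.strptime(begin_time, "%Y-%m-%d %H:%M:%S")
    return begin + timedelta(minutes=interval_minutes)


props = {
    "syncCores": "core0",
    "server": "localhost",
    "port": "8080",
    "webapp": "solr",
    "params": "/deltaimport?command=delta-import&clean=false&commit=true",
}
urls = delta_urls(props)

started = datetime(2016, 12, 12, 9, 0, 0)  # hypothetical service start time
disabled = first_rebuild_time("14:05:00", 0, started)      # as configured above
five_days = first_rebuild_time("14:05:00", 7200, started)  # the stated default

print(urls[0])
print(disabled, five_days)
```

With reBuildIndexInterval=0 as configured here, only the delta URL is called, once every interval minutes, and no full rebuild is scheduled.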
Edit WEB-INF/web.xml of the solr webapp under Tomcat and add the following before the servlet element:
<listener>
<listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>
Download solr-data-import-scheduler-1.1.2.jar from
https://pan.baidu.com/s/1whxYyI6nGzHvEtsTCZH4pw
and place it in \apache-tomcat-9.0.16\webapps\solr\WEB-INF\lib
Source: https://blog.csdn.net/shua67/article/details/111643454