全文搜索中的Mysql德语口音不敏感搜索
作者:互联网
让我们看一个酒店表示例:
CREATE TABLE `hotels` (
`HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
`Hotel` varchar(80) character set latin1 NOT NULL default '',
`City` varchar(100) character set latin1 default NULL,
`CityFR` varchar(100) character set latin1 default NULL,
`Region` varchar(50) character set latin1 default NULL,
`RegionFR` varchar(100) character set latin1 default NULL,
`Country` varchar(50) character set latin1 default NULL,
`CountryFR` varchar(50) character set latin1 default NULL,
`HotelText` text character set latin1,
`HotelTextFR` text character set latin1,
`tagsforsearch` text character set latin1,
`tagsforsearchFR` text character set latin1,
PRIMARY KEY (`HotelNo`),
FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
例如,在此表中,我们只有一家酒店,其地区名称=“Graubünden”(请注意umlautü字符)
现在,我想对短语进行相同的搜索匹配:
“ graubunden”和
“graubünden”
使用内置的MySql很简单
常规搜索中的排序规则如下:
SELECT *
FROM `hotels`
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1)
COLLATE latin1_german1_ci
这对于“ graubunden”和“graubünden”工作正常,并且
结果我收到了适当的结果,但是问题是
当我们进行MySQL全文搜索时
此SQL语句有什么问题?:
SELECT
*
FROM
hotels
WHERE
MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY Country ASC, Region ASC, City ASC
这不会返回任何结果.
有什么想法把狗埋在哪里吗?
解决方法:
为列定义单个CHARACTER SETS时,将覆盖在表级别设置为默认的排序规则.
您的每个列都有默认的latin1排序规则(即latin1_swedish_ci).您可以通过运行SHOW CREATE TABLE看到它.
在FULLTEXT查询中,索引列的COERCIBILITY为0,也就是说,所有全文查询都将转换为索引中使用的排序规则,反之亦然.
您需要从列中删除CHARACTER SET定义,或将所有列显式设置为latin1_german_ci:
CREATE TABLE `hotels` (
`HotelNo` varchar(4) NOT NULL default '0000',
`Hotel` varchar(80) NOT NULL default '',
`City` varchar(100) default NULL,
`CityFR` varchar(100) default NULL,
`Region` varchar(50) default NULL,
`RegionFR` varchar(100) default NULL,
`Country` varchar(50) default NULL,
`CountryFR` varchar(50) default NULL,
`HotelText` text,
`HotelTextFR` text,
`tagsforsearch` text,
`tagsforsearchFR` text,
PRIMARY KEY (`HotelNo`),
FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
INSERT
INTO hotels (hotelText, HotelTextFR, tagsforsearch, tagsforsearchFR)
VALUES ('text', 'text', 'graubünden', 'tags');
SELECT *
FROM hotels
WHERE MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST (CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY
Country ASC, Region ASC, City ASC;
标签:encoding,collation,diacritics,full-text-search,mysql 来源: https://codeday.me/bug/20191210/2099231.html