
python – How does SequenceMatcher.ratio work in difflib


I am trying out Python's difflib module and came across SequenceMatcher. I tried the following examples, but I could not understand what is happening.

>>> from difflib import SequenceMatcher
>>> SequenceMatcher(None,"abc","a").ratio()
0.5

>>> SequenceMatcher(None,"aabc","a").ratio()
0.4

>>> SequenceMatcher(None,"aabc","aa").ratio()
0.6666666666666666

Now, according to the documentation for ratio():

Return a measure of the sequences’ similarity as a float in the range
[0, 1]. Where T is the total number of elements in both sequences, and
M is the number of matches, this is 2.0*M / T.

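So, for my cases, I expected:

1. T = 4 and M = 1, so the ratio is 2 * 1 / 4 = 0.5
2. T = 5 and M = 2, so the ratio is 2 * 2 / 5 = 0.8
3. T = 6 and M = 1, so the ratio is 2 * 1 / 6.0 = 0.33

As I understand it, in the second case T = len("aabc") + len("a") and M = 2, because "a" occurs twice in "aabc".

So where am I going wrong, and what am I missing?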

Here is the source code of SequenceMatcher.ratio().

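Answer:

You have the first case right. In the second case, only one "a" from "aabc" is matched, because the second string contains only a single "a" to pair it with, so M = 1 and the ratio is 2 * 1 / 5 = 0.4. In the third case, both "a"s are matched, so M = 2 and the ratio is 2 * 2 / 6 ≈ 0.667.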
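To check the values of M and T by hand, one option is to sum the sizes of the blocks returned by get_matching_blocks() and apply the documented formula 2.0*M / T. The following is a minimal sketch of that idea; the helper name explain_ratio is made up for illustration and is not part of difflib.

```python
from difflib import SequenceMatcher


def explain_ratio(a, b):
    """Return (M, T, ratio) computed by hand from the matching blocks.

    explain_ratio is a hypothetical helper for illustration only.
    """
    sm = SequenceMatcher(None, a, b)
    # Each block is a named tuple (a, b, size); the final block is a
    # dummy with size 0, so it does not affect the sum.
    m = sum(block.size for block in sm.get_matching_blocks())
    t = len(a) + len(b)
    # Two empty sequences are treated as identical.
    ratio = 2.0 * m / t if t else 1.0
    return m, t, ratio


for a, b in [("abc", "a"), ("aabc", "a"), ("aabc", "aa")]:
    m, t, r = explain_ratio(a, b)
    print(f"{a!r} vs {b!r}: M={m}, T={t}, 2*M/T={r}")
```

For the three pairs in the question this prints M=1, M=1 and M=2 respectively, which gives the ratios 0.5, 0.4 and 0.6666666666666666 shown in the interpreter session.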

[P.S.: You are referring to the ancient Python 2.4 source code. The current source code is at hg.python.org.]
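If you would rather read the version of ratio() that your own interpreter actually runs, a small sketch using the standard inspect module is shown here. It assumes difflib is available as a pure-Python module with its source file present, which is the case for a regular CPython installation.

```python
import difflib
import inspect

# Fetch the source text of ratio() from the difflib module that this
# interpreter imported, then print it.
source = inspect.getsource(difflib.SequenceMatcher.ratio)
print(source)
```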

Source: https://codeday.me/bug/20190901/1785756.html