python – How does SequenceMatcher.ratio work in difflib
Author: Internet
I am trying out Python's difflib module and came across SequenceMatcher. I tried the following examples but could not understand the results.
>>> from difflib import SequenceMatcher
>>> SequenceMatcher(None, "abc", "a").ratio()
0.5
>>> SequenceMatcher(None, "aabc", "a").ratio()
0.4
>>> SequenceMatcher(None, "aabc", "aa").ratio()
0.6666666666666666
Now, according to the documentation for ratio:
Return a measure of the sequences' similarity as a float in the range [0, 1]. Where T is the total number of elements in both sequences, and M is the number of matches, this is 2.0*M / T.
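The quoted formula can be sketched directly in code. This is only an illustration, not difflib's own implementation: `ratio_from_formula` is a hypothetical helper name, and it assumes that M is the sum of the sizes of the blocks returned by `get_matching_blocks()`.

```python
from difflib import SequenceMatcher


def ratio_from_formula(a, b):
    """Hypothetical helper: compute 2.0*M / T by hand."""
    sm = SequenceMatcher(None, a, b)
    # M: total size of all matching blocks (the trailing dummy block has size 0).
    m = sum(block.size for block in sm.get_matching_blocks())
    # T: total number of elements in both sequences.
    t = len(a) + len(b)
    # Two empty sequences are treated as identical.
    return 2.0 * m / t if t else 1.0


# "bcd" is the only matching block, so M = 3 and T = 8.
print(ratio_from_formula("abcd", "bcde"))              # 0.75
print(SequenceMatcher(None, "abcd", "bcde").ratio())   # 0.75
```

Note that M counts matched pairs of elements, so an element of one sequence is never counted against more than one element of the other.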
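So, for my cases, I expected:
> T = 4 and M = 1, so the ratio is 2 * 1 / 4 = 0.5
> T = 5 and M = 2, so the ratio is 2 * 2 / 5 = 0.8
> T = 6 and M = 1, so the ratio is 2 * 1 / 6.0 = 0.33
As I understand it, in the second case T = len("aabc") + len("a") and M = 2, because a occurs twice in aabc.
So where am I going wrong, and what am I missing?
Here is the source code of SequenceMatcher.ratio().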
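Answer:
You have the first case right. In the second case, only one a from aabc is matched, so M = 1 and the ratio is 2 * 1 / 5 = 0.4. M counts matched pairs of elements, so the single a in the second string can be paired with only one a in aabc, no matter how many times a occurs there. In the third example, both a's are matched, so M = 2 and the ratio is 2 * 2 / 6 = 0.67.
[P.S.: You are referring to the ancient Python 2.4 source code. The current source code is at hg.python.org.]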
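To see which elements are actually matched in the three cases, one can inspect the matching blocks. The sketch below is only illustrative: `match_count` is a hypothetical helper name, and it assumes M equals the total size of the blocks reported by `get_matching_blocks()`.

```python
from difflib import SequenceMatcher


def match_count(a, b):
    """Hypothetical helper: return (M, T, blocks) for two sequences."""
    sm = SequenceMatcher(None, a, b)
    # Drop the zero-size sentinel block that always ends the list.
    blocks = [blk for blk in sm.get_matching_blocks() if blk.size]
    m = sum(blk.size for blk in blocks)
    return m, len(a) + len(b), blocks


for a, b in [("abc", "a"), ("aabc", "a"), ("aabc", "aa")]:
    m, t, blocks = match_count(a, b)
    print(a, b, "M =", m, "T =", t, "ratio =", 2.0 * m / t)
    for blk in blocks:
        # Each block says: a[blk.a : blk.a + blk.size] == b[blk.b : blk.b + blk.size]
        print("   matched:", a[blk.a:blk.a + blk.size])
```

For the three pairs this gives M = 1, 1 and 2 with T = 4, 5 and 6, which reproduces the ratios 0.5, 0.4 and 0.666... from the question.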
Tags: string-matching, python, string, similarity
Source: https://codeday.me/bug/20190901/1785756.html