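Python: Passing variables into WordNet synsets methods in NLTK
Author: Internet
I need to work on a project that requires NLTK, so I started learning Python two weeks ago, but I am struggling to understand both Python and NLTK.
From the NLTK documentation I can follow the code below, and it works well when I manually type the words apple and pear into it.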
from nltk.corpus import wordnet as wn
apple = wn.synset('apple.n.01')
pear = wn.synset('pear.n.01')
print(apple.lch_similarity(pear))
Output: 2.53897387106
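However, I need to use NLTK on lists of items. For example, I have the two lists below and I want to compare the items in list1 against those in list2: compare word 1 of list1 with every word in list2, then word 2 of list1 with every word in list2, and so on until every word in list1 has been compared.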
list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]
wordFromList1 = list1[0]
wordFromList2 = list2[0]
wordFromList1 = wn.synset(wordFromList1)
wordFromList2 = wn.synset(wordFromList2)
print(wordFromList1.lch_similarity(wordFromList2))
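The code above fails, of course. Can anyone tell me how to pass a variable to the synset method [wn.synset(*pass_variable_in_here*)] so that I can use a double loop to get the lch_similarity values? Thanks.
Solution: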
wordnet.synset expects a 3-part name string of the form word.pos.nn. You did not specify the pos.nn part for any of the words in list1 and list2.
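As an illustration of the word.pos.nn format, here is a minimal sketch that builds and splits such names with plain string handling. The helper names `synset_name` and `split_synset_name` are hypothetical and not part of NLTK. Building a well-formed name does not guarantee that WordNet actually contains a synset with that name.

```python
def synset_name(word, pos="n", sense=1):
    """Build a 3-part name such as 'apple.n.01' (hypothetical helper)."""
    # The sense number is zero-padded to two digits, as in 'pear.n.01'.
    return "{}.{}.{:02d}".format(word, pos, sense)


def split_synset_name(name):
    """Split 'drink_in.v.01' into ('drink_in', 'v', 1) (hypothetical helper)."""
    # Split from the right so only the last two dots act as separators.
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, pos, int(sense)


if __name__ == "__main__":
    print(synset_name("apple"))
    print(split_synset_name("drink_in.v.01"))
```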
It seems reasonable to assume that all the words are nouns, so we could try appending the string '.n.01' to each string in list1 and list2:
for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synset(word1+'.n.01')
    wordFromList2 = wordnet.synset(word2+'.n.01')
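However, that does not work: wordnet.synset('drinks.n.01') raises a WordNetError.
On the other hand, the same documentation page shows that you can look up similar words with the synsets method.
For example, wordnet.synsets('drinks') returns the list: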
[Synset('drink.n.01'),
Synset('drink.n.02'),
Synset('beverage.n.01'),
Synset('drink.n.04'),
Synset('swallow.n.02'),
Synset('drink.v.01'),
Synset('drink.v.02'),
Synset('toast.v.02'),
Synset('drink_in.v.01'),
Synset('drink.v.05')]
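So at this point you need to think about what you want the program to do. If you are happy to take the first item in this list as a proxy for drinks, then you could use
for word1, word2 in IT.product(list1, list2):
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]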
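Indexing with [0] raises an IndexError when synsets returns an empty list, and the first entry is not guaranteed to be a noun. The following is a minimal sketch of one way to guard against both, assuming you only want noun senses. The function name `first_noun_synset` and its `wn` parameter are hypothetical; the only NLTK call used is `synsets(word, pos=...)`.

```python
def first_noun_synset(word, wn=None):
    """Return the first noun synset for `word`, or None if there is none."""
    if wn is None:
        # Imported lazily so a stand-in object can be passed in for testing.
        from nltk.corpus import wordnet as wn
    # 'n' is WordNet's part-of-speech code for nouns.
    candidates = wn.synsets(word, pos="n")
    return candidates[0] if candidates else None
```

Called as `first_noun_synset("drinks")`, this uses the real WordNet corpus, which has to be installed. The caller then has to decide what to do with a `None` result, for example skipping that word pair.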
This leads to a program that looks like this:
import nltk.corpus as corpus
import itertools as IT
wordnet = corpus.wordnet
list1 = ["apple", "honey", "drinks", "flowers", "paper"]
list2 = ["pear", "shell", "movie", "fire", "tree", "candle"]
for word1, word2 in IT.product(list1, list2):
    # print(word1, word2)
    # Take the first synset returned for each word as its representative.
    wordFromList1 = wordnet.synsets(word1)[0]
    wordFromList2 = wordnet.synsets(word2)[0]
    # In NLTK 3, name is a method; older releases exposed it as the attribute .name.
    print('{w1}, {w2}: {s}'.format(
        w1=wordFromList1.name(),
        w2=wordFromList2.name(),
        s=wordFromList1.lch_similarity(wordFromList2)))
which yields
apple.n.01, pear.n.01: 2.53897387106
apple.n.01, shell.n.01: 1.07263680226
apple.n.01, movie.n.01: 1.15267950994
apple.n.01, fire.n.01: 1.07263680226
...
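If the scores are needed later rather than only printed, the same double loop can collect them in a dictionary. This is a sketch under stated assumptions, not the answer's original code: `pairwise_scores` and `wordnet_lch` are hypothetical helpers, and restricting the lookup to nouns with `pos=wn.NOUN` is an added choice, made so that both synsets share a part of speech.

```python
import itertools


def pairwise_scores(words1, words2, score):
    """Map each (w1, w2) pair to score(w1, w2), skipping pairs scored None."""
    results = {}
    for w1, w2 in itertools.product(words1, words2):
        value = score(w1, w2)
        if value is not None:
            results[(w1, w2)] = value
    return results


def wordnet_lch(w1, w2):
    """LCH similarity of the first noun synsets of two words, or None."""
    # Imported here so pairwise_scores can be used without NLTK installed.
    from nltk.corpus import wordnet as wn
    s1 = wn.synsets(w1, pos=wn.NOUN)
    s2 = wn.synsets(w2, pos=wn.NOUN)
    if not s1 or not s2:
        return None  # at least one word has no noun synset
    return s1[0].lch_similarity(s2[0])
```

With the WordNet corpus installed, `pairwise_scores(list1, list2, wordnet_lch)` returns a dictionary keyed by word pairs, from which the best match for each word in list1 can be picked with `max`.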
Tags: wordnet, python, nltk. Source: https://codeday.me/bug/20190901/1782578.html