
Counting How Often Chinese Words Appear in a File

Author: Internet

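import jieba

# Read the whole novel as UTF-8 text.
with open('红楼梦.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

# Segment the text into a list of words.
ls = jieba.lcut(txt)

# Count how often each word occurs.
d = {}
for w in ls:
    d[w] = d.get(w, 0) + 1

# Report words of at least two characters that occur 200 times or more.
for k in d:
    if d[k] >= 200 and len(k) >= 2:
        print(f'"{k}" appears {d[k]} times')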
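
The loop above prints qualifying words in the order they first appear in the text, not by frequency. Below is a minimal sketch of the same count using `collections.Counter` from the standard library, which can return words sorted from most to least frequent. The helper name `frequent_words`, its parameters `min_count` and `min_len`, and the sample token list are assumptions made for illustration and are not part of the original post. The function takes an already segmented word list, such as the output of `jieba.lcut(txt)`.

```python
from collections import Counter


def frequent_words(words, min_count=200, min_len=2):
    """Return (word, count) pairs sorted by descending count.

    Hypothetical helper: keeps words with at least `min_len` characters
    that occur at least `min_count` times.
    """
    # Skip short tokens (single characters, punctuation) before counting.
    counts = Counter(w for w in words if len(w) >= min_len)
    # most_common() yields pairs ordered from highest to lowest count.
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


# Tiny hand-made token list standing in for jieba.lcut(txt).
sample = ['宝玉', '的', '黛玉', '宝玉', '了', '宝玉', '黛玉']
for word, count in frequent_words(sample, min_count=2):
    print(f'"{word}" appears {count} times')
```

With the real text, passing `ls` from the listing above and leaving the defaults unchanged should apply the same thresholds as the original loop (200 occurrences, two characters).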

 

Source: https://www.cnblogs.com/waterr/p/14801163.html