Word Frequency Count for 红楼梦 (Dream of the Red Chamber)
Author: from the Internet
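import logging

import jieba

# Suppress jieba's debug output (e.g. dictionary loading messages)
jieba.setLogLevel(logging.INFO)

# Read the whole novel; the file is opened as GB18030
with open('红楼梦.txt', 'r', encoding='gb18030') as f:
    txt = f.read()

# Segment the text into a list of words
words = jieba.lcut(txt)

# Count every word of two or more characters
counts = {}
for word in words:
    if len(word) == 1:
        continue
    counts[word] = counts.get(word, 0) + 1

# Sort by frequency, highest first, and print the top 20
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)
for word, count in items[:20]:
    print('{0:<10}{1:>5}'.format(word, count))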
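
The counting and sorting steps above can also be written with `collections.Counter` from the standard library. The following is only a minimal sketch, not the original author's method: the helper name `top_words`, its `n` and `min_len` parameters, and the hand-made `sample` list are all assumptions made for illustration. The sample stands in for the output of `jieba.lcut(txt)` so that the snippet runs without jieba or the novel's text file.

```python
from collections import Counter


def top_words(words, n=20, min_len=2):
    """Return the n most frequent tokens that have at least min_len characters.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    counts = Counter(w for w in words if len(w) >= min_len)
    # most_common sorts by count, highest first
    return counts.most_common(n)


# Hand-made token list standing in for the result of jieba.lcut(txt)
sample = ['宝玉', '的', '黛玉', '宝玉', '了', '贾母', '黛玉', '宝玉']

for word, count in top_words(sample, n=3):
    print('{0:<10}{1:>5}'.format(word, count))
```

With the real text, `top_words(jieba.lcut(txt))` should give the same kind of top-20 list as the loop above. `Counter` replaces the manual `counts.get(word, 0) + 1` bookkeeping, and `most_common` replaces the explicit sort.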
Source: https://www.cnblogs.com/pxxxx/p/15548330.html