
Python: Drawing Word Clouds with the wordcloud Library (Part 2)

2021-01-28 13:00:05 Author: Internet

Contents

Introduction
1. wordcloud Basics
2. A Simple Word Cloud
3. Changing the Colors

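Introduction

wordcloud is a third-party Python library for drawing word clouds (also called text clouds), which summarize a text visually according to the frequency of its words. Install it with: pip install wordcloud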
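To make "according to the frequency of its words" concrete, here is a minimal sketch that uses only the standard library. The function name `relative_weights` and the sample sentence are made up for illustration, and the tokenization is deliberately naive; the library's own layout logic is more involved than this.

```python
import re
from collections import Counter


def relative_weights(text):
    """Map each word to a weight in (0, 1]; the most frequent word gets 1.0."""
    # Naive tokenization: lowercase words of at least two characters.
    words = re.findall(r"\w[\w']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # Dividing by the highest count yields relative weights.
    return {word: n / top for word, n in counts.items()}


weights = relative_weights("cloud word cloud text cloud word")
print(weights)
```

A word cloud then draws words with larger weights in larger type.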

1. wordcloud Basics

Default values of WordCloud():

WordCloud(font_path=None, width=400, height=200, margin=2, ranks_only=None, 
prefer_horizontal=.9, mask=None, scale=1, color_func=None, max_words=200,
min_font_size=4, stopwords=None, random_state=None, 
background_color='black', max_font_size=None, font_step=1, mode="RGB",
relative_scaling='auto', regexp=None, collocations=True, colormap=None,
normalize_plurals=True, contour_width=0, 
contour_color='black', repeat=False,include_numbers=False, 
min_word_length=0, collocation_threshold=30)

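Parameter	Example	Description
background_color	background_color='white'	Background color of the image; hex color strings are accepted. Default 'black'.
width	width=600	Canvas width in pixels. Default 400.
height	height=400	Canvas height in pixels. Default 200.
margin	margin=20	Margin between words. Default 2.
scale	scale=0.5	Scaling factor applied to the whole image. Default 1.
prefer_horizontal	prefer_horizontal=0.9	Ratio of attempts to place a word horizontally rather than vertically. Default 0.9.
min_font_size	min_font_size=10	Smallest font size used. Default 4.
max_font_size	max_font_size=20	Largest font size used. Default None, in which case the image height is used.
max_words	max_words=200	Maximum number of words shown. Default 200.
font_step	font_step=2	Step size for the font size when fitting words. Default 1; values above 1 may speed up the computation but give a worse fit.
stopwords	stopwords={'dog'}	Set of words to filter out. If None, the built-in STOPWORDS list is used. Ignored by generate_from_frequencies.
mode	mode='RGB'	Color mode. Default 'RGB'; with mode='RGBA' and background_color=None the background is transparent.
relative_scaling	relative_scaling=1	How strongly word frequency affects font size. With 0 only the rank of a word counts; with 1 a word that is twice as frequent is drawn twice as large. Default 'auto', which means 0.5 unless repeat=True.
color_func	color_func=None	Callable that returns a color for each word; overrides colormap. If None, colors are drawn from colormap.
regexp	regexp=None	Regular expression used to split the input text into tokens. If None, a built-in word pattern is used.
collocations	collocations=False	Whether to include collocations (bigrams) of two words. Default True.
colormap	colormap=None	Matplotlib colormap from which a color is drawn at random for each word. Ignored if color_func is given. Default 'viridis'.
random_state	random_state=1	Seed for the random number generator used for layout and colors; a fixed value gives reproducible output.
font_path	font_path='PangMenZhengDaoBiaoTiTi-1.ttf'	Path to the font file (TTF or OTF) to use.
mask	mask=None	Mask image as an array. Words are drawn only where the mask is not pure white (#FFFFFF, RGB(255,255,255)); when a mask is given, width and height are ignored and the shape of the mask is used.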
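One pitfall with the stopwords parameter is worth a short illustration: calling set() on a string builds a set of its characters, not a set containing the word. The sketch below uses only built-ins; the token list is invented for the example.

```python
# set() iterates over its argument, so a string is split into characters.
wrong = set('dog')   # {'d', 'o', 'g'}
right = {'dog'}      # a set containing the single word 'dog'

tokens = ['dog', 'good', 'cat']  # hypothetical tokens

# With the character set, the word 'dog' itself is not filtered out.
kept_with_wrong = [t for t in tokens if t not in wrong]
# With the set literal, 'dog' is removed as intended.
kept_with_right = [t for t in tokens if t not in right]

print(kept_with_wrong)
print(kept_with_right)
```

For several stop words, a set literal such as {'dog', 'cat'} or set(['dog', 'cat']) behaves as expected.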

2. A Simple Word Cloud

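import jieba  # Chinese word segmentation
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud  # word cloud library
import numpy as np
from collections import Counter  # word frequency counting
import PIL.Image  # image loading; a bare "import PIL" does not guarantee that PIL.Image is loaded

df = pd.read_excel(r'D:\python学习\评论.xlsx')  # read the comments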

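# Segment the text and remove stop words (the stop-word file plus custom ones)
def get_cut_words(content_series):
    # Load the stop-word list, one word per line
    stop_words = set()  # a set keeps the membership test below fast
    # 'r' opens the file read-only; UTF-8 is assumed here, adjust (e.g. 'gbk') to match the file
    with open(r"D:\python学习\chineseStopWords.txt", 'r', encoding='utf-8') as f:
        for line in f:
            stop_words.add(line.strip())  # strip() removes surrounding whitespace
    # Optional: register custom words so that jieba keeps them whole
    #my_words = []
    #for i in my_words:
        #jieba.add_word(i)
    # Custom stop words
    my_stop_words = ['快递', '收到']
    stop_words.update(my_stop_words)
    # Segmentation
    content = ';'.join([str(c) for c in content_series.tolist()])  # str(c) converts each cell to a string
    word_num = jieba.lcut(content)
    # Keep words that are not stop words and have at least two characters
    word_num_selected = [i for i in word_num if i not in stop_words and len(i) >= 2]
    return word_num_selected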
    
# Apply the function above to the imported text
text1 = get_cut_words(content_series=df['评论'])

c = Counter(text1)  # count word frequencies
common_c = c.most_common(300)  # the 300 most frequent words
common_c

mk = np.array(PIL.Image.open(r'D:\python学习\zihaowordcloud\code\图片\wujiaoxing.png'))  # load the mask image

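# Configure the word cloud
wc = WordCloud(
            # Font
            font_path = 'C:/Windows/Fonts/simhei.ttf',  # a font with Chinese glyphs is required, otherwise the words render as empty boxes
            # Background color
            background_color='white',
            scale=1,  # larger values give a sharper image but need more memory and time
            # Shape of the word cloud
            mask=mk,
            width=900, height=600,  # ignored when a mask is given; the shape of the mask is used
            #max_words=300,            # maximum number of words shown (default 200)
            max_font_size=60,         # maximum font size
            min_font_size=3,         # minimum font size
            random_state=50           # seed for the random generator, so the result is reproducible
            )

# Generate the word cloud
wc.generate_from_frequencies(dict(common_c))
# Render and show the image
plt.imshow(wc)
plt.axis('off')
plt.show()
# Save the image
wc.to_file(r'D:\python学习\zihaowordcloud\code\pic.jpg')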

The result is as follows:
[Figure: the generated word cloud]
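The preprocessing steps in this section (filter out stop words and single-character tokens, count, keep the most frequent words) can be tried without jieba or the Excel file. In the sketch below the token list stands in for the output of jieba.lcut and is invented for illustration, as is the helper name `select_words`.

```python
from collections import Counter


def select_words(tokens, stop_words, min_len=2):
    """Drop stop words and tokens shorter than min_len characters."""
    stop_words = set(stop_words)  # set lookup is fast for large lists
    return [t for t in tokens if t not in stop_words and len(t) >= min_len]


# Hypothetical tokens, as a segmenter might return them.
tokens = ['质量', '很', '好', '快递', '质量', '满意', '收到', '满意', '质量']

selected = select_words(tokens, stop_words={'快递', '收到'})
top = Counter(selected).most_common(2)
print(top)  # [('质量', 3), ('满意', 2)]
```

Passing dict(top) to generate_from_frequencies corresponds to the dict(common_c) call used above.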

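3. Changing the Colors

This uses the parameter colormap : string or matplotlib colormap, default="viridis"
It is a Matplotlib colormap from which a color is drawn at random for each word, so changing its name changes the overall look. It is ignored if color_func is specified.
Example: colormap = 'Blues'. Valid names include: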
'Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', 'BuGn', 'BuGn_r', 'BuPu', 'BuPu_r', 'CMRmap', 'CMRmap_r', 'Dark2', 'Dark2_r', 'GnBu', 'GnBu_r', 'Greens', 'Greens_r', 'Greys', 'Greys_r', 'OrRd', 'OrRd_r', 'Oranges', 'Oranges_r', 'PRGn', 'PRGn_r', 'Paired', 'Paired_r', 'Pastel1', 'Pastel1_r', 'Pastel2', 'Pastel2_r', 'PiYG', 'PiYG_r', 'PuBu', 'PuBuGn', 'PuBuGn_r', 'PuBu_r', 'PuOr', 'PuOr_r', 'PuRd', 'PuRd_r', 'Purples', 'Purples_r', 'RdBu', 'RdBu_r', 'RdGy', 'RdGy_r', 'RdPu', 'RdPu_r', 'RdYlBu', 'RdYlBu_r', 'RdYlGn', 'RdYlGn_r', 'Reds', 'Reds_r', 'Set1', 'Set1_r', 'Set2', 'Set2_r', 'Set3', 'Set3_r', 'Spectral', 'Spectral_r', 'Wistia', 'Wistia_r', 'YlGn', 'YlGnBu', 'YlGnBu_r', 'YlGn_r', 'YlOrBr', 'YlOrBr_r', 'YlOrRd', 'YlOrRd_r', 'afmhot', 'afmhot_r', 'autumn', 'autumn_r', 'binary', 'binary_r', 'bone', 'bone_r', 'brg', 'brg_r', 'bwr', 'bwr_r', 'cividis', 'cividis_r', 'cool', 'cool_r', 'coolwarm', 'coolwarm_r', 'copper', 'copper_r', 'cubehelix', 'cubehelix_r', 'flag', 'flag_r', 'gist_earth', 'gist_earth_r', 'gist_gray', 'gist_gray_r', 'gist_heat', 'gist_heat_r', 'gist_ncar', 'gist_ncar_r', 'gist_rainbow', 'gist_rainbow_r', 'gist_stern', 'gist_stern_r', 'gist_yarg', 'gist_yarg_r', 'gnuplot', 'gnuplot2', 'gnuplot2_r', 'gnuplot_r', 'gray', 'gray_r', 'hot', 'hot_r', 'hsv', 'hsv_r', 'inferno', 'inferno_r', 'jet', 'jet_r', 'magma', 'magma_r', 'nipy_spectral', 'nipy_spectral_r', 'ocean', 'ocean_r', 'pink', 'pink_r', 'plasma', 'plasma_r', 'prism', 'prism_r', 'rainbow', 'rainbow_r', 'seismic', 'seismic_r', 'spring', 'spring_r', 'summer', 'summer_r', 'tab10', 'tab10_r', 'tab20', 'tab20_r', 'tab20b', 'tab20b_r', 'tab20c', 'tab20c_r', 'terrain', 'terrain_r', 'twilight', 'twilight_r', 'twilight_shifted', 'twilight_shifted_r', 'viridis', 'viridis_r', 'winter', 'winter_r'

To change the colors, just add the colormap argument, e.g. wc = WordCloud(colormap = 'Blues'); everything else stays the same.

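wc = WordCloud(
            # Font
            font_path = 'C:/Windows/Fonts/simhei.ttf',  # a font with Chinese glyphs is required, otherwise the words render as empty boxes
            # Background color
            background_color='white',
            scale=3,  # larger values give a sharper image but need more memory and time
            # Shape of the word cloud
            mask=mk,
            width=900, height=600,  # ignored when a mask is given; the shape of the mask is used
            colormap='PuOr',  # Matplotlib colormap used to color the words
            #max_words=300,            # maximum number of words shown (default 200)
            max_font_size=60,         # maximum font size
            min_font_size=3,         # minimum font size
            random_state=50           # seed for the random generator, so the result is reproducible
            )

# Generate the word cloud
wc.generate_from_frequencies(dict(common_c))
# Render and show the image
plt.imshow(wc)
plt.axis('off')
plt.show()
# Save the image
wc.to_file(r'D:\python学习\zihaowordcloud\code\pic1.jpg')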

The result is as follows:
[Figure: the word cloud drawn with the 'PuOr' colormap]
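Since color_func takes precedence over colormap, a custom function is the alternative when none of the named colormaps fits. The sketch below is an assumption-laden example, not part of the original article: the name `blue_color_func` and the hue and lightness values are arbitrary choices. It returns a color string in the "hsl(...)" form that PIL understands.

```python
import random


def blue_color_func(word, font_size, position, orientation,
                    random_state=None, **kwargs):
    """Return a blue tone with random lightness for every word."""
    # Use the generator handed in by the caller when there is one,
    # so that a fixed seed gives reproducible colors.
    rng = random_state if random_state is not None else random
    lightness = rng.randint(30, 60)  # arbitrary range chosen for this sketch
    return f"hsl(210, 80%, {lightness}%)"


# Calling it directly, with a seeded generator for repeatable output.
print(blue_color_func("词云", 20, (0, 0), None, random_state=random.Random(1)))
```

It would be passed as WordCloud(color_func=blue_color_func), or applied to an existing cloud with wc.recolor(color_func=blue_color_func).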

Source: https://blog.csdn.net/Txixi/article/details/113306822