
Building a Decision Tree Model with sklearn

2021-09-19 11:07:18 Author: Internet

Training a model with sklearn and predicting results

Defining the data

def loaddata():
    dataSet = [[0, 0, 0, 0, 'no'],
               [0, 0, 0, 1, 'no'],
               [0, 1, 0, 1, 'yes'],
               [0, 1, 1, 0, 'yes'],
               [0, 0, 0, 0, 'no'],
               [1, 0, 0, 0, 'no'],
               [1, 0, 0, 1, 'no'],
               [1, 1, 1, 1, 'yes'],
               [1, 0, 1, 2, 'yes'],
               [1, 0, 1, 2, 'yes'],
               [2, 0, 1, 2, 'yes'],
               [2, 0, 1, 1, 'yes'],
               [2, 1, 0, 1, 'yes'],
               [2, 1, 0, 2, 'yes'],
               [2, 0, 0, 0, 'no']]
    feature_name = ['age', 'job', 'house', 'credit']
    return dataSet, feature_name

myDat, feature_name = loaddata()

Training the model and predicting

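from sklearn import tree
import numpy as np

# Build the feature matrix X and the label vector y.
# Slicing each row before calling np.array keeps X numeric; np.array(myDat)
# would turn every value into a string because each row ends with a label.
X = np.array([row[:4] for row in myDat])
y = np.array([row[-1] for row in myDat])
# Create the decision tree classifier
model = tree.DecisionTreeClassifier()
# Fit the model on the data
model.fit(X, y)
# Predict the label of a new sample
print(model.predict([[1, 1, 0, 1]]))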
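Beyond the predicted label, a fitted classifier can also report its class order, the class probabilities for a sample, and its accuracy on the training data. The following is a minimal sketch that repeats the 15 samples from loaddata() so it runs on its own; the names data, sample and proba, and the fixed random_state, are choices made for this illustration and are not part of the original code.

```python
import numpy as np
from sklearn import tree

# Same 15 samples as loaddata(): age, job, house, credit, label
data = [
    [0, 0, 0, 0, 'no'], [0, 0, 0, 1, 'no'], [0, 1, 0, 1, 'yes'],
    [0, 1, 1, 0, 'yes'], [0, 0, 0, 0, 'no'], [1, 0, 0, 0, 'no'],
    [1, 0, 0, 1, 'no'], [1, 1, 1, 1, 'yes'], [1, 0, 1, 2, 'yes'],
    [1, 0, 1, 2, 'yes'], [2, 0, 1, 2, 'yes'], [2, 0, 1, 1, 'yes'],
    [2, 1, 0, 1, 'yes'], [2, 1, 0, 2, 'yes'], [2, 0, 0, 0, 'no'],
]
X = np.array([row[:4] for row in data])   # numeric features
y = np.array([row[-1] for row in data])   # string labels

# random_state is fixed only to make repeated runs reproducible
model = tree.DecisionTreeClassifier(random_state=0)
model.fit(X, y)

sample = [[1, 1, 0, 1]]
proba = model.predict_proba(sample)

print(model.classes_)         # class labels in sorted order
print(model.predict(sample))  # predicted label for the sample
print(proba)                  # one column per entry of classes_
print(model.score(X, y))      # accuracy on the training data itself
```

Accuracy measured on the training data only shows that the tree fits these samples; it says nothing about how the model behaves on unseen data.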

sklearn implements decision trees as the DecisionTreeClassifier class. Its main parameters are:

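Parameter	Type	Default	Description
criterion	"gini", "entropy"	"gini"	Split quality measure: Gini impurity or entropy (information gain)
max_depth	int	None	Maximum depth of the tree. If unset, nodes keep splitting until every leaf is pure or holds fewer than min_samples_split samples
min_samples_split	int, float	2	Minimum number of samples a node must hold before it may be split (a float is read as a fraction of the training samples)
min_samples_leaf	int, float	1	Minimum number of samples in a leaf; a split is considered only if it leaves at least this many samples in each branch
min_impurity_decrease	float	0.0	A node is split only if the split reduces the weighted impurity by at least this value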
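To get a feel for these parameters, one option is to fit the same data with different settings and compare the resulting trees. The sketch below repeats the 15 samples from loaddata() so it is self-contained; the variable names (full, stump, coarse) and the specific parameter values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Same 15 samples as loaddata(): age, job, house, credit, label
data = [
    [0, 0, 0, 0, 'no'], [0, 0, 0, 1, 'no'], [0, 1, 0, 1, 'yes'],
    [0, 1, 1, 0, 'yes'], [0, 0, 0, 0, 'no'], [1, 0, 0, 0, 'no'],
    [1, 0, 0, 1, 'no'], [1, 1, 1, 1, 'yes'], [1, 0, 1, 2, 'yes'],
    [1, 0, 1, 2, 'yes'], [2, 0, 1, 2, 'yes'], [2, 0, 1, 1, 'yes'],
    [2, 1, 0, 1, 'yes'], [2, 1, 0, 2, 'yes'], [2, 0, 0, 0, 'no'],
]
X = np.array([row[:4] for row in data])
y = np.array([row[-1] for row in data])

# Defaults: Gini impurity, no depth limit, leaves may hold a single sample
full = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

# Entropy criterion with the depth capped at 1, i.e. a single split
stump = DecisionTreeClassifier(criterion="entropy", max_depth=1,
                               random_state=0).fit(X, y)

# Every leaf must keep at least 4 training samples
coarse = DecisionTreeClassifier(min_samples_leaf=4, random_state=0).fit(X, y)

for name, m in [("full", full), ("stump", stump), ("coarse", coarse)]:
    # depth, number of leaves, and accuracy on the training data
    print(name, m.get_depth(), m.get_n_leaves(), m.score(X, y))
```

Tighter limits generally yield smaller trees that fit the training data less exactly, which is the usual way to trade training accuracy for a simpler model.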

Plotting

Configuring graphviz

from sklearn.tree import export_graphviz
import graphviz

Installation requirements

1. Download and install the graphviz package: http://www.graphviz.org/download/
2. Install the graphviz library for Python: pip install graphviz
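Both pieces are needed: the pip package is a Python wrapper, while rendering relies on the Graphviz program itself. A quick way to check the setup is sketched below. It assumes the Graphviz installer exposes its dot executable through PATH, which may require adding the install directory to PATH manually on some systems.

```python
import importlib.util
import shutil

# Is the Python wrapper importable? find_spec returns None if it is missing.
has_python_pkg = importlib.util.find_spec("graphviz") is not None

# Is the Graphviz "dot" executable reachable through PATH?
dot_path = shutil.which("dot")

print("Python graphviz package installed:", has_python_pkg)
if dot_path is None:
    print("Graphviz 'dot' executable not found on PATH")
else:
    print("Found dot at", dot_path)
```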

Generating the dot file

export_graphviz(
    model,
    out_file="./tree.dot",
    feature_names=feature_name,
    # class_names must follow the order of model.classes_ (sorted: 'no', 'yes')
    class_names=['no', 'yes'],
    rounded=True,
    filled=True
)
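If no file on disk is needed, export_graphviz can also return the dot source as a string when out_file is None. The sketch below is self-contained and repeats the 15 samples from loaddata(). It takes the class names from model.classes_, because export_graphviz expects them in ascending class order, which for these labels is 'no' then 'yes'. The name dot_source is chosen for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Same 15 samples as loaddata(): age, job, house, credit, label
data = [
    [0, 0, 0, 0, 'no'], [0, 0, 0, 1, 'no'], [0, 1, 0, 1, 'yes'],
    [0, 1, 1, 0, 'yes'], [0, 0, 0, 0, 'no'], [1, 0, 0, 0, 'no'],
    [1, 0, 0, 1, 'no'], [1, 1, 1, 1, 'yes'], [1, 0, 1, 2, 'yes'],
    [1, 0, 1, 2, 'yes'], [2, 0, 1, 2, 'yes'], [2, 0, 1, 1, 'yes'],
    [2, 1, 0, 1, 'yes'], [2, 1, 0, 2, 'yes'], [2, 0, 0, 0, 'no'],
]
feature_name = ['age', 'job', 'house', 'credit']
X = np.array([row[:4] for row in data])
y = np.array([row[-1] for row in data])
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# out_file=None makes export_graphviz return the dot text instead of writing it
dot_source = export_graphviz(
    model,
    out_file=None,
    feature_names=feature_name,
    class_names=[str(c) for c in model.classes_],  # sorted class order
    rounded=True,
    filled=True,
)
print(dot_source[:200])  # first part of the generated dot description
```

The returned string can be passed directly to graphviz.Source, which avoids reading the file back in.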

Rendering the plot from the dot file

with open("./tree.dot") as f:
    dot_graph = f.read()
dot = graphviz.Source(dot_graph)
# Render the graph and open it in the default viewer
dot.view()
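When Graphviz is not available, the tree can still be inspected as plain text. The sketch below uses sklearn.tree.export_text, which is an alternative to the graphviz route and not something the original article uses. It repeats the 15 samples from loaddata() so it runs on its own, and the name text_tree is chosen for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Same 15 samples as loaddata(): age, job, house, credit, label
data = [
    [0, 0, 0, 0, 'no'], [0, 0, 0, 1, 'no'], [0, 1, 0, 1, 'yes'],
    [0, 1, 1, 0, 'yes'], [0, 0, 0, 0, 'no'], [1, 0, 0, 0, 'no'],
    [1, 0, 0, 1, 'no'], [1, 1, 1, 1, 'yes'], [1, 0, 1, 2, 'yes'],
    [1, 0, 1, 2, 'yes'], [2, 0, 1, 2, 'yes'], [2, 0, 1, 1, 'yes'],
    [2, 1, 0, 1, 'yes'], [2, 1, 0, 2, 'yes'], [2, 0, 0, 0, 'no'],
]
feature_name = ['age', 'job', 'house', 'credit']
X = np.array([row[:4] for row in data])
y = np.array([row[-1] for row in data])
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Build an indented, rule-style text description of the fitted tree
text_tree = export_text(model, feature_names=feature_name)
print(text_tree)
```

Each indented line is one split condition, and the "class:" lines are the leaves, so the output can be read as a set of nested if/else rules.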

Hands-on code

The above covers the basics of using sklearn. For a worked code example, see the article "Personal Credit Risk Assessment with Decision Trees".

Source: https://blog.csdn.net/weixin_43915107/article/details/120378305