Machine Learning in Practice 2: Exploring the Titanic Survivor Problem with KNN and Decision Trees

Author: Internet

Solving the Titanic Problem with KNN and a Decision Tree

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.metrics import classification_report
import graphviz  # decision tree visualization

data = pd.read_csv(r"titanic_data.csv")
data.drop("PassengerId", axis=1, inplace=True)  # drop the id column
data
     Survived  Pclass     Sex   Age
0           0       3    male  22.0
1           1       1  female  38.0
2           1       3  female  26.0
3           1       1  female  35.0
4           0       3    male  35.0
..        ...     ...     ...   ...
886         0       2    male  27.0
887         1       1  female  19.0
888         0       3  female   NaN
889         1       1    male  26.0
890         0       3    male  32.0

891 rows × 4 columns

# Encode Sex as an integer column: male -> 1, female -> 0
data["Sex"] = data["Sex"].map({"male": 1, "female": 0})
data
     Survived  Pclass  Sex   Age
0           0       3    1  22.0
1           1       1    0  38.0
2           1       3    0  26.0
3           1       1    0  35.0
4           0       3    1  35.0
..        ...     ...  ...   ...
886         0       2    1  27.0
887         1       1    0  19.0
888         0       3    0   NaN
889         1       1    1  26.0
890         0       3    1  32.0

891 rows × 4 columns

data.fillna(data["Age"].mean(), inplace=True)  # fill missing values with the mean of Age (the only column with NaN here)
data
     Survived  Pclass  Sex        Age
0           0       3    1  22.000000
1           1       1    0  38.000000
2           1       3    0  26.000000
3           1       1    0  35.000000
4           0       3    1  35.000000
..        ...     ...  ...        ...
886         0       2    1  27.000000
887         1       1    0  19.000000
888         0       3    0  29.699118
889         1       1    1  26.000000
890         0       3    1  32.000000

891 rows × 4 columns
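
`DataFrame.fillna` with a scalar fills NaN in every column of the frame, so when more than one column can have gaps it is usually safer to impute column by column. A minimal sketch on a small made-up frame (the values are illustrative and do not come from titanic_data.csv):

```python
import pandas as pd

# Hypothetical toy frame; values are illustrative only.
toy = pd.DataFrame({
    "Pclass": [3, 1, 3, 1],
    "Age": [22.0, 38.0, None, 30.0],
})

# Fill only the Age column with its own mean; other columns stay untouched.
age_mean = toy["Age"].mean()  # mean() skips NaN by default
toy["Age"] = toy["Age"].fillna(age_mean)

print(toy)
```

Scoping the fill to one column keeps the Age mean from being written into an unrelated column that happens to contain NaN.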

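Dtc = DecisionTreeClassifier(max_depth=5, random_state=8)  # build the decision tree
Dtc.fit(data.iloc[:, 1:], data["Survived"])  # train the model
pre = Dtc.predict(data.iloc[:, 1:])  # predict on the same rows used for training
# Classification report (not a confusion matrix). The predictions are passed first here,
# so relative to sklearn's (y_true, y_pred) convention the precision and recall columns
# are swapped and "support" counts predicted labels.
print(classification_report(pre, data["Survived"]))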
              precision    recall  f1-score   support

           0       0.88      0.84      0.86       573
           1       0.73      0.79      0.76       318

    accuracy                           0.82       891
   macro avg       0.81      0.82      0.81       891
weighted avg       0.83      0.82      0.82       891
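
To read the report, it helps to see how precision, recall and F1 come from raw counts. sklearn's `classification_report` takes the true labels first and the predictions second; the call above passes them the other way round, which swaps precision and recall for each class. A minimal pure-Python sketch, assuming a small made-up label list (the helper `precision_recall_f1` is hypothetical, not part of sklearn):

```python
# Hypothetical labels for illustration only (not taken from the Titanic data).
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]


def precision_recall_f1(truth, predicted, positive):
    """Compute precision, recall and F1 for one class from paired labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Conventional order: true labels first, predictions second.
p, r, f = precision_recall_f1(y_true, y_pred, positive=1)

# Swapped order: precision and recall trade places, F1 is unchanged.
p_swapped, r_swapped, f_swapped = precision_recall_f1(y_pred, y_true, positive=1)

print(p, r, f)
print(p_swapped, r_swapped, f_swapped)
```

Because F1 is symmetric in precision and recall, the f1-score and accuracy columns are unaffected by the argument order; only precision, recall and support change meaning.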
pre == data["Survived"]  # check whether each prediction matches the actual value
0       True
1       True
2       True
3       True
4       True
       ...  
886     True
887     True
888    False
889    False
890     True
Name: Survived, Length: 891, dtype: bool
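
The report and the comparison above score the model on the same rows it was trained on, which tends to overstate how well it generalizes. A minimal sketch of a held-out evaluation, assuming synthetic stand-in features (the data and the label rule below are made up and are not the Titanic file):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins for the Pclass, Sex and Age columns (assumption).
X = np.column_stack([
    rng.integers(1, 4, size=n),   # class 1..3
    rng.integers(0, 2, size=n),   # sex encoded as 0/1
    rng.uniform(1, 80, size=n),   # age in years
])
# Hypothetical label rule: mostly determined by the sex column, with 20% noise.
y = ((X[:, 1] == 0) ^ (rng.random(n) < 0.2)).astype(int)

# Hold out 25% of the rows; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=8, stratify=y
)

clf = DecisionTreeClassifier(max_depth=5, random_state=8)
clf.fit(X_train, y_train)

train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

With the real data, the same pattern would apply to `data.iloc[:, 1:]` and `data["Survived"]`; the gap between the two accuracies gives a rough sense of overfitting.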

Visualization

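# class_names expects one name per class in ascending label order (0, 1);
# the labels "Died" and "Survived" are chosen here for readability.
dot_data = export_graphviz(Dtc, feature_names=["Pclass", "Sex", "Age"], class_names=["Died", "Survived"])
graph = graphviz.Source(dot_data)
graph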
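
If the Graphviz binaries are not installed, the tree can still be inspected as plain text with `sklearn.tree.export_text`. A minimal sketch, assuming a tiny made-up training set in place of the Titanic data:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy rows in (Pclass, Sex, Age) order; illustrative values only.
X = [
    [3, 1, 22.0],
    [1, 0, 38.0],
    [3, 0, 26.0],
    [1, 0, 35.0],
    [3, 1, 35.0],
    [2, 1, 27.0],
]
y = [0, 1, 1, 1, 0, 0]

clf = DecisionTreeClassifier(max_depth=2, random_state=8).fit(X, y)

# Render the learned splits as indented text rules.
rules = export_text(clf, feature_names=["Pclass", "Sex", "Age"])
print(rules)
```

Each indented line is one split condition, and each leaf line reports the predicted class, which is handy for quick checks in a terminal.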

[Figure: the rendered decision tree]

Source: https://blog.csdn.net/qq_45176548/article/details/112060492