python - How to filter a crosstab created in pandas on specific columns
Author: Internet
I created a crosstab in pandas with the following command:
grouped_missing_analysis = pd.crosstab(clean_sessions.action_type, clean_sessions.action, margins=True).unstack()
print(grouped_missing_analysis[:20])
which prints:
action  action_type
10      Missing                0
        Unknown                0
        booking_request        0
        booking_response       0
        click                  0
        data                   0
        message_post        3215
        modify                 0
        partner_callback       0
        submit                 0
        view                   0
        All                 3215
11      Missing                0
        Unknown                0
        booking_request        0
        booking_response       0
        click                  0
        data                   0
        message_post         716
        modify                 0
dtype: int64
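I only want to show the rows whose action_type is 'Unknown', 'Missing' or 'Other', and ignore every other action_type for each action. I suspect the answer has something to do with: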
.where(clean_sessions.action_type.isin(('Missing', 'Unknown')), 'Other')
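which I took from an earlier snippet, but I cannot get it to work. Perhaps pivot_table would be easier, but the point of this exercise is for me to learn how to use the different functions for data analysis in Python.
The raw data in clean_sessions looks like this: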
      user_id          action  action_type            action_detail  \
0  d1mm9tcy42          lookup      Missing                  Missing
1  d1mm9tcy42  search_results        click      view_search_results
2  d1mm9tcy42          lookup      Missing                  Missing
3  d1mm9tcy42  search_results        click      view_search_results
4  d1mm9tcy42          lookup      Missing                  Missing
5  d1mm9tcy42  search_results        click      view_search_results
6  d1mm9tcy42          lookup      Missing                  Missing
7  d1mm9tcy42     personalize         data  wishlist_content_update
8  d1mm9tcy42           index         view      view_search_results
9  d1mm9tcy42          lookup      Missing                  Missing

       device_type  secs_elapsed
0  Windows Desktop           319
1  Windows Desktop         67753
2  Windows Desktop           301
3  Windows Desktop         22141
4  Windows Desktop           435
5  Windows Desktop          7703
6  Windows Desktop           115
7  Windows Desktop           831
8  Windows Desktop         20842
9  Windows Desktop           683
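Solution:
After unstack(), action and action_type are levels of the index, not columns, so you need to pass labels to select the rows you are interested in.
You can pass slice(None) for the first level (which selects everything in it) and then a list of labels for the second level: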
In [102]:
grouped_missing_analysis.loc[slice(None), ['Missing', 'Unknown', 'Other']]
Out[102]:
action          action_type
index           Missing        0
lookup          Missing        5
personalize     Missing        0
search_results  Missing        0
All             Missing        5
dtype: int64
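The selection above can be reproduced end to end with a small sketch. It assumes a stand-in clean_sessions built from only the action and action_type columns of the ten sample rows in the question, and it filters with a boolean mask on the action_type index level rather than with the .loc call, so labels that never occur in the data (here 'Unknown' and 'Other') are simply skipped. The names wanted, mask and selected are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for clean_sessions: only the two columns the crosstab
# needs, copied from the ten sample rows shown in the question.
clean_sessions = pd.DataFrame({
    "action": ["lookup", "search_results", "lookup", "search_results", "lookup",
               "search_results", "lookup", "personalize", "index", "lookup"],
    "action_type": ["Missing", "click", "Missing", "click", "Missing",
                    "click", "Missing", "data", "view", "Missing"],
})

# Same construction as in the question: a Series indexed by (action, action_type).
grouped = pd.crosstab(
    clean_sessions.action_type, clean_sessions.action, margins=True
).unstack()

# Keep only the rows whose action_type level is one of the wanted labels.
wanted = ["Missing", "Unknown", "Other"]
mask = grouped.index.get_level_values("action_type").isin(wanted)
selected = grouped[mask]

print(selected)
```

With this sample data only the 'Missing' rows survive, which matches the output shown above.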
The pandas docs provide more details on this style of indexing.
Tags: crosstab, pandas, python, group-by Source: https://codeday.me/bug/20191027/1943724.html
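As for the .where idea from the question, one possible sketch is to collapse action_type before building the crosstab, so that every value other than 'Missing' or 'Unknown' is counted as 'Other'. This is not the approach from the answer above; the stand-in clean_sessions and the names keep, collapsed and table are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for clean_sessions, copied from the sample rows in the question.
clean_sessions = pd.DataFrame({
    "action": ["lookup", "search_results", "lookup", "search_results", "lookup",
               "search_results", "lookup", "personalize", "index", "lookup"],
    "action_type": ["Missing", "click", "Missing", "click", "Missing",
                    "click", "Missing", "data", "view", "Missing"],
})

# Series.where keeps a value where the condition is True and substitutes
# "Other" everywhere else.
keep = ["Missing", "Unknown"]
collapsed = clean_sessions.action_type.where(
    clean_sessions.action_type.isin(keep), "Other"
)

# Cross-tabulate the collapsed labels against action.
table = pd.crosstab(collapsed, clean_sessions.action, margins=True)

print(table)
```

Here the rows of table are the collapsed action_type labels plus the 'All' margin, so no filtering is needed afterwards.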