dropna: handling missing data
Author: Internet
- Function signature
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
- Parameters
- axis
{0 or 'index', 1 or 'columns'}, default 0
Determines whether rows or columns that contain missing values are removed.
  - 0, or 'index': drop rows that contain missing values.
  - 1, or 'columns': drop columns that contain missing values.
  - Changed in version 1.0.0: passing a tuple or list to drop on multiple axes is no longer supported; only a single axis is allowed.
- how
{'any', 'all'}, default 'any'
Determines whether a row or column is removed when it has at least one NA value, or only when all of its values are NA.
  - 'any': drop the row or column if any NA value is present.
  - 'all': drop the row or column only if all of its values are NA.
- thresh
int, optional
Require that many non-NA values; a row or column with fewer non-NA values is dropped.
- subset
array-like, optional
Labels along the other axis to consider, e.g. if you are dropping rows, this is the list of columns to check for missing values.
- inplace
bool, default False
If True, perform the operation in place and return None.
- Returns
DataFrame or None
DataFrame with NA entries dropped from it, or None if inplace=True.
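The difference between the default return value and inplace=True is easy to miss, so here is a minimal sketch of both. The frame `scores` and the names `cleaned`, `working` and `result` are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not from the original post)
scores = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

# Default call: a new DataFrame is returned and `scores` is left untouched
cleaned = scores.dropna()
print(len(scores), len(cleaned))   # 3 1

# inplace=True: the frame itself is modified and the call returns None
working = scores.copy()
result = working.dropna(inplace=True)
print(result is None, len(working))   # True 1
```

Because the in-place call returns None, writing `df = df.dropna(inplace=True)` would replace the frame with None; use either the assignment or the inplace flag, not both.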
- Examples
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"),
                            pd.NaT]})
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
Default: drop every row that contains at least one missing value
df.dropna()
     name        toy       born
1  Batman  Batmobile 1940-04-25
Drop every column that contains at least one NaN value
df.dropna(axis='columns')
       name
0    Alfred
1    Batman
2  Catwoman
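The subset parameter takes labels along the other axis, so when columns are being dropped it expects row labels. The following is a minimal sketch that rebuilds the sample frame from this post; the names `by_row0` and `by_row1` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Only row 0 is inspected: 'toy' and 'born' are missing there, so both columns go
by_row0 = df.dropna(axis="columns", subset=[0])
print(list(by_row0.columns))   # ['name']

# Only row 1 is inspected: it has no missing values, so every column is kept
by_row1 = df.dropna(axis="columns", subset=[1])
print(list(by_row1.columns))   # ['name', 'toy', 'born']
```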
Drop rows in which every value is missing (this sample has no such row, so nothing is removed)
df.dropna(how='all')
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
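Since the sample frame has no completely empty row, how='all' leaves it unchanged. To see the option take effect, here is a minimal sketch with a hypothetical frame `toys` (not from the original post) whose middle row is entirely missing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has no values at all, row 0 is only partly missing
toys = pd.DataFrame({"name": ["Alfred", None, "Catwoman"],
                     "toy": [np.nan, np.nan, "Bullwhip"]})

# how='all' removes only the fully empty row
kept_all = toys.dropna(how="all")
print(list(kept_all.index))   # [0, 2]

# how='any' (the default) also removes the partly missing row 0
kept_any = toys.dropna(how="any")
print(list(kept_any.index))   # [2]
```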
Keep only rows with at least 2 non-NA values
df.dropna(thresh=2)
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
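thresh counts the non-missing values in each row and keeps the rows that reach the threshold. One way to check this is to compute the counts by hand. This is a sketch that rebuilds the sample frame from this post; `non_na_per_row`, `by_thresh` and `by_mask` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Number of non-missing values in each row
non_na_per_row = df.notna().sum(axis=1)
print(non_na_per_row.tolist())   # [1, 3, 2]

# thresh=2 keeps the rows whose count is at least 2
by_thresh = df.dropna(thresh=2)

# The same selection written as an explicit boolean mask
by_mask = df[non_na_per_row >= 2]
print(list(by_thresh.index), list(by_mask.index))   # [1, 2] [1, 2]
```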
Drop rows with a missing value in the name or toy column
df.dropna(subset=['name', 'toy'])
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
Modify df in place; the call itself returns None, and df then holds the result
df.dropna(inplace=True)
df
     name        toy       born
1  Batman  Batmobile 1940-04-25
Source: https://blog.csdn.net/weixin_43745072/article/details/112969660