python – Merging DataFrames with different dates?
Author: Internet
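I want to merge a separate DataFrame (df2) into the main DataFrame (df1). If, for a given row, the date in df1 does not exist in df2, the merge should fall back to the most recent date before that row's date.
I tried pd.merge, but it drops the rows whose dates do not match and keeps only the rows that match in both DataFrames.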
import pandas as pd

# Main frame: one (date, key) pair per row
df1 = [['2007-01-01','A'],
       ['2007-01-02','B'],
       ['2007-01-03','C'],
       ['2007-01-04','B'],
       ['2007-01-06','C']]
# Frame to merge in: (date, key, value)
df2 = [['2007-01-01','B',3],
       ['2007-01-02','A',4],
       ['2007-01-03','B',5],
       ['2007-01-06','C',3]]
df1 = pd.DataFrame(df1)
df2 = pd.DataFrame(df2)
# Parse the date strings in column 0 into datetimes
df1[0] = pd.to_datetime(df1[0])
df2[0] = pd.to_datetime(df2[0])
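Current df1 with pd.merge():
           0  1  2
0 2007-01-06  C  3
It only picks up dates that match exactly in both DataFrames; it does not consider the value at the most recent earlier date.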
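The single-row result follows from pd.merge defaulting to an inner join on the shared columns. As a minimal sketch, the sample data is rebuilt below with the column names date, key and value (these names are assumptions for readability and do not appear in the original). It suggests that how="left" keeps every row of df1 but still leaves NaN wherever the date does not match exactly, so a plain merge alone does not solve the problem:

```python
import pandas as pd

# Column names "date", "key", "value" are illustrative assumptions.
left = pd.DataFrame(
    [["2007-01-01", "A"], ["2007-01-02", "B"], ["2007-01-03", "C"],
     ["2007-01-04", "B"], ["2007-01-06", "C"]],
    columns=["date", "key"],
)
right = pd.DataFrame(
    [["2007-01-01", "B", 3], ["2007-01-02", "A", 4],
     ["2007-01-03", "B", 5], ["2007-01-06", "C", 3]],
    columns=["date", "key", "value"],
)
left["date"] = pd.to_datetime(left["date"])
right["date"] = pd.to_datetime(right["date"])

# Default how="inner": only (date, key) pairs present in both frames survive.
inner = pd.merge(left, right, on=["date", "key"])

# how="left": every row of `left` is kept, unmatched rows get NaN in "value".
left_join = pd.merge(left, right, on=["date", "key"], how="left")

print(inner)
print(left_join)
```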
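Expected df1:
           0  1    2
0 2007-01-01  A  NaN
1 2007-01-02  B    3
2 2007-01-03  C  NaN
3 2007-01-04  B    3
4 2007-01-06  C    3
The NaN values appear because df2 has no matching data on or before that date. For index row 1 the value comes from the previous day, while for index row 4 it comes from exactly the same day.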
Solution:
Use merge_asof and check the output:
pd.merge_asof(df1,df2,on=0,by=1,allow_exact_matches=True)
Out[15]:
           0  1    2
0 2007-01-01  A  NaN
1 2007-01-02  B  3.0
2 2007-01-03  C  NaN
3 2007-01-04  B  5.0  # should be 5, not 3: df2 has two B rows, and the one with value 5 (2007-01-03) is the closest earlier date
4 2007-01-06  C  3.0
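To make the matching rule explicit, the sketch below repeats the merge_asof call and cross-checks it against a plain lookup of "the latest df2 row with the same key on or before the date". The column names date, key and value and the helper lookup are hypothetical, introduced only for illustration. merge_asof expects both frames to be sorted by the on column; the sample data already is, so sort_values is there only as a safeguard.

```python
import pandas as pd

# Column names "date", "key", "value" are illustrative assumptions.
left = pd.DataFrame(
    [["2007-01-01", "A"], ["2007-01-02", "B"], ["2007-01-03", "C"],
     ["2007-01-04", "B"], ["2007-01-06", "C"]],
    columns=["date", "key"],
)
right = pd.DataFrame(
    [["2007-01-01", "B", 3], ["2007-01-02", "A", 4],
     ["2007-01-03", "B", 5], ["2007-01-06", "C", 3]],
    columns=["date", "key", "value"],
)
left["date"] = pd.to_datetime(left["date"])
right["date"] = pd.to_datetime(right["date"])

# merge_asof needs both frames sorted by the "on" column.
left = left.sort_values("date")
right = right.sort_values("date")

# direction="backward" (the default) takes the last row of `right`
# with the same key whose date is on or before the date in `left`.
asof = pd.merge_asof(left, right, on="date", by="key", direction="backward")


def lookup(date, key):
    """Hypothetical helper: value of the latest `right` row for `key` on or before `date`."""
    candidates = right[(right["key"] == key) & (right["date"] <= date)]
    if candidates.empty:
        return float("nan")
    return float(candidates["value"].iloc[-1])


# Row-by-row reference result to compare against merge_asof.
reference = [lookup(d, k) for d, k in zip(left["date"], left["key"])]

print(asof)
print(reference)
```

If the closest date in either direction is wanted instead, merge_asof also accepts direction="nearest", and tolerance=pd.Timedelta(...) can cap how far back a match may lie.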
Tags: python, dataframe, pandas, data-manipulation
Source: https://codeday.me/bug/20190701/1345835.html