Python pandas: retrieving the previous result/row for each user
Author: Internet
I'm new to pandas.
I have a DataFrame that looks like this (only larger):
   Horses       RaceDate  Position
1  RedHorse     1/2/00    2
2  BlueHorse    1/2/00    6
3  YellowHorse  1/2/00    7
4  RedHorse     15/1/00   3
I want to add a column holding each horse's previous result, so that the DataFrame ends up looking like this:
   Horses       RaceDate  Position  PrevPosition
1  RedHorse     1/2/00    2         3
2  BlueHorse    1/2/00    6         -
3  YellowHorse  1/2/00    7         -
4  RedHorse     15/1/00   3         -
I tried the following:
def prevRuns(horseName, raceDate):
    horseDf = df.loc[df['Horse'] == horseName]
    currentRace = horseDf.index[horseDf['RaceDate'] == raceDate]
    if len(horseDf.index) >= currentRace:
        return horseDf.at[currentRace+1,'Position']
    else:
        return 0

df['prevRun'] = df['Horse'].apply(prevRuns, raceDate = df['RaceDate'])
But this doesn't work. It raises:
ValueError: Can only compare identically-labeled Series objects
Why doesn't it work?
Is there a more elegant way to do what I'm trying to accomplish?
Solution:
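On the "why": a likely cause, assuming the real frame has a `Horse` column as the question's code does, is that `Series.apply` forwards `raceDate=df['RaceDate']` unchanged to every call. Inside `prevRuns`, the filtered `horseDf['RaceDate']` is then compared against the whole column, and pandas refuses to compare two Series whose index labels differ. A minimal sketch that reproduces the message (sample values are copied from the question; `horse_df` and `error_message` are illustrative names only):

```python
import pandas as pd

# Sample data from the question, using the 'Horse' column name from its code
df = pd.DataFrame(
    {
        "Horse": ["RedHorse", "BlueHorse", "YellowHorse", "RedHorse"],
        "RaceDate": ["1/2/00", "1/2/00", "1/2/00", "15/1/00"],
        "Position": [2, 6, 7, 3],
    },
    index=[1, 2, 3, 4],
)

# Rows for a single horse: only index labels 1 and 4 remain
horse_df = df.loc[df["Horse"] == "RedHorse"]

error_message = None
try:
    # Filtered column (2 rows) compared with the full column (4 rows)
    horse_df["RaceDate"] == df["RaceDate"]
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```
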
You can use groupby with shift:
import pandas as pd

# convert dates to datetime (day first) and sort newest first
df['RaceDate'] = pd.to_datetime(df['RaceDate'], dayfirst=True)
df = df.sort_values('RaceDate', ascending=False)

# within each horse, shift(-1) pulls the Position from the next (older) row
df['PrevPosition'] = df.groupby('Horses')['Position'].shift(-1)
print(df)
   Horses       RaceDate    Position  PrevPosition
1  RedHorse     2000-02-01  2         3.0
2  BlueHorse    2000-02-01  6         NaN
3  YellowHorse  2000-02-01  7         NaN
4  RedHorse     2000-01-15  3         NaN
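For anyone who wants to run this end to end, here is a self-contained sketch built on the sample rows from the question. It is a variant of the approach above, not the answer's exact code: it sorts oldest first and uses `shift(1)`, and it assumes the dates are day/month/two-digit-year, so the explicit `format` string is an assumption about the data. The name `chronological` is illustrative.

```python
import pandas as pd

# Sample data from the question, keeping its 1-based index labels
df = pd.DataFrame(
    {
        "Horses": ["RedHorse", "BlueHorse", "YellowHorse", "RedHorse"],
        "RaceDate": ["1/2/00", "1/2/00", "1/2/00", "15/1/00"],
        "Position": [2, 6, 7, 3],
    },
    index=[1, 2, 3, 4],
)

# Assumption: dates are day/month/two-digit year, e.g. 15/1/00 is 15 Jan 2000
df["RaceDate"] = pd.to_datetime(df["RaceDate"], format="%d/%m/%y")

# Sort oldest first; a stable sort keeps same-day rows in their original order
chronological = df.sort_values("RaceDate", kind="stable")

# Within each horse, shift(1) pulls the Position from the previous (older) race.
# Assigning back aligns on the index, so df keeps its original row order.
df["PrevPosition"] = chronological.groupby("Horses")["Position"].shift(1)

print(df)
```

Sorting ascending with `shift(1)` gives the same values as sorting descending with `shift(-1)`. Horses with no earlier race get NaN, which is also why the column becomes float.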
Tags: pandas-groupby, pandas, python
Source: https://codeday.me/bug/20191211/2106727.html