How to apply OLS from statsmodels to a groupby
Author: Internet
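I am running OLS on products by month. This works fine for a single product, but my DataFrame contains many products. If I create a groupby object and pass it to OLS, I get an error.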
linear_regression_df:
  product_desc  period_num  TOTALS
0    product_a           1      53
3    product_a           2      52
6    product_a           3      50
1    product_b           1      44
4    product_b           2      43
7    product_b           3      41
2    product_c           1      36
5    product_c           2      35
8    product_c           3      34
from pandas import DataFrame, Series
import statsmodels.api as sm
linear_regression_grouped = linear_regression_df.groupby(['product_desc'])
X = linear_regression_grouped['period_num']
y = linear_regression_grouped['TOTALS']
model = sm.OLS(y, X)
results = model.fit()
I get this error on the sm.OLS() line:
ValueError: unrecognized data structures: <class 'pandas.core.groupby.SeriesGroupBy'>
So how can I walk through the DataFrame and apply sm.OLS() to each product_desc?
Solution:
You can do something like this…
import pandas as pd
import statsmodels.api as sm

# Fit one model per product by filtering the DataFrame on each unique product_desc.
for product in linear_regression_df.product_desc.unique():
    tempdf = linear_regression_df[linear_regression_df.product_desc == product]
    X = tempdf['period_num']
    y = tempdf['TOTALS']
    model = sm.OLS(y, X)  # note: sm.OLS fits no intercept unless X is wrapped in sm.add_constant(X)
    results = model.fit()
    print(results.params)  # Or whatever summary info you want
Tags: statsmodels, python, pandas Source: https://codeday.me/bug/20191009/1878718.html