scipy.polyfit(x, y, 100) should give a degree-100 polynomial, so why does matplotlib.pyplot.legend show 53?

Author: Internet

I'm having a hard time figuring out why my plt.legend shows the wrong polynomial degree. It says 53 instead of 100. My code looks like this:

import scipy as sp
import numpy as np
import urllib2
import matplotlib.pyplot as plt

url = 'https://raw.github.com/luispedro/BuildingMachineLearningSystemsWithPython/master/ch01/data/web_traffic.tsv'
src = urllib2.urlopen(url)
data = np.genfromtxt(src)

x = data[:, 0]
y = data[:, 1]
x = x[~sp.isnan(y)] 
y = y[~sp.isnan(y)] 

def error(f, a, b):
    return sp.sum((f(a) - b) ** 2)

fp100 = sp.polyfit(x, y, 100)
f100 = sp.poly1d(fp100)
plt.plot(x, f100(x), linewidth=4)
# legend() expects a sequence of labels; a bare string would be split into characters
plt.legend(["d={num}".format(num=f100.order)], loc=2)
plt.show()

Solution:

I can reproduce your result:

>>> np.__version__
1.8.0
>>> fp100 = sp.polyfit(x, y, 100)
polynomial.py:587: RankWarning: Polyfit may be poorly conditioned
  warnings.warn(msg, RankWarning)
>>> f100 = sp.poly1d(fp100)
>>> f100.order
53

Note the warning, and consult the docs:

polyfit issues a RankWarning when the least-squares fit is badly conditioned. This implies that the best fit is not well-defined due to numerical error. The results may be improved by lowering the polynomial degree or by replacing x by x - x.mean().
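To make the conditioning problem concrete, here is a minimal sketch that compares the condition number of the Vandermonde matrix for raw, centered, and centered-and-scaled x values. The x array is a hypothetical stand-in (743 evenly spaced indices), not the actual web_traffic.tsv column, and the extra scaling step goes beyond what the docs quote suggests. np.polyfit may also rescale columns internally, so treat the numbers as illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the x column: evenly spaced sample indices.
x = np.arange(1.0, 744.0)


def vander_cond(values, degree):
    """Condition number of the Vandermonde matrix for a fit of this degree."""
    return np.linalg.cond(np.vander(values, degree + 1))


degree = 4

# Raw x: all columns are positive and strongly correlated.
cond_raw = vander_cond(x, degree)

# Centered x, as the docs quote suggests.
x_centered = x - x.mean()
cond_centered = vander_cond(x_centered, degree)

# Centered and scaled into [-1, 1] (an extra step assumed here for comparison).
x_scaled = x_centered / np.abs(x_centered).max()
cond_scaled = vander_cond(x_scaled, degree)

print("raw:      {:.3g}".format(cond_raw))
print("centered: {:.3g}".format(cond_centered))
print("scaled:   {:.3g}".format(cond_scaled))
```

A smaller condition number means the least-squares problem is less sensitive to rounding error. Already at degree 4 the raw values are far worse conditioned than the scaled ones, and the gap grows quickly with the degree.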

Your y has low variance:

>>> y.mean()
1961.7438692098092
>>> y.std()
860.64491521872196

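So one would not expect a higher-degree polynomial to fit it much better. Note that after replacing x with x - x.mean(), as the docs suggest, a low-degree polynomial approximates the data no worse than a high-degree one: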

>>> xp=x-x.mean()
>>> f100 = sp.poly1d(sp.polyfit(xp, y,100))
>>> max(abs(f100(xp)-y)/y)
2.1173504721727299
>>> abs((f100(xp)-y)/y).mean()
0.18100985148093593

>>> f4 = sp.poly1d(sp.polyfit(xp, y, 4))
>>> max(abs(f4(xp)-y)/y)
2.1228866902203842
>>> abs((f4(xp)-y)/y).mean()
0.20139219654066282

>>> print f4
           4             3             2
8.827e-08 x + 3.161e-05 x + 0.0003102 x + 0.06247 x + 1621
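The comparison above can be wrapped in small helpers. This is a sketch on synthetic data (a noisy quadratic generated with a fixed seed), not the values from web_traffic.tsv. The helper relative_errors and the standardization of x (subtracting the mean and dividing by the standard deviation) are assumptions added for illustration.

```python
import numpy as np


def error(f, a, b):
    """Sum of squared residuals, mirroring the question's error() helper."""
    return np.sum((f(a) - b) ** 2)


def relative_errors(f, a, b):
    """Max and mean relative error, the two metrics used in the session."""
    rel = np.abs((f(a) - b) / b)
    return rel.max(), rel.mean()


# Synthetic stand-in data: a quadratic trend plus Gaussian noise.
rng = np.random.RandomState(0)
x = np.arange(1.0, 744.0)
xp = (x - x.mean()) / x.std()  # centered and scaled (assumption)
y = 2000.0 + 300.0 * xp + 150.0 * xp ** 2 + rng.normal(scale=100.0, size=x.size)

results = {}
for degree in (2, 4, 10):
    f = np.poly1d(np.polyfit(xp, y, degree))
    sse = error(f, xp, y)
    rel_max, rel_mean = relative_errors(f, xp, y)
    results[degree] = (sse, rel_max, rel_mean)
    print("d={:2d}  sse={:.4g}  max rel={:.4f}  mean rel={:.4f}".format(
        degree, sse, rel_max, rel_mean))
```

On data like this, raising the degree lowers the squared error only marginally, which matches the point made above: once the trend is captured, extra terms mostly fit noise.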

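In fact, the most significant components appear to be of degree 2. So it is normal that the best approximating polynomial of degree no higher than 100 actually has degree 53. All higher monomials are degenerate.

[Figure: plot of the two approximations; the red line is the degree-4 polynomial, the green line the degree-53 one.]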
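One mechanism that produces a lower reported degree is that np.poly1d drops leading zero coefficients, so its order attribute reflects the highest power with a nonzero coefficient rather than the degree requested from polyfit. The sketch below uses a made-up coefficient vector. Whether the leading coefficients of the question's fit are exactly zero is an assumption, since the session does not print them.

```python
import numpy as np

# Hypothetical coefficients, highest power first, shaped like the result of
# polyfit(x, y, 5) in which the two leading coefficients came out as zero.
coeffs = np.array([0.0, 0.0, 3.0, 0.0, -2.0, 1.0])

p = np.poly1d(coeffs)

requested_degree = len(coeffs) - 1
effective_degree = len(np.trim_zeros(coeffs, "f")) - 1  # strip leading zeros only

print(requested_degree)  # 5
print(p.order)           # 3
print(effective_degree)  # 3
```

If the legend should show the degree that was requested, len(fp100) - 1 is a safer label than f100.order, because the coefficient array returned by polyfit keeps its full length.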

Tags: matplotlib, curve-fitting, python, numpy
Source: https://codeday.me/bug/20191122/2058743.html