Python Implementation of the Coursera Machine Learning Programming Assignments (Andrew Ng): 1.2 Linear regression with multiple variables
1.2 Linear regression with multiple variables
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Read the data
data2 = pd.read_csv('ex1data2.txt', sep=',', header=None, names=['size', 'bedrooms', 'price'])
Data preprocessing
# Feature normalization: z-score scale every column except the target
data2.iloc[:, :-1] = (data2.iloc[:, :-1] - data2.iloc[:, :-1].mean()) / data2.iloc[:, :-1].std()
# Add the intercept column of ones
data2.insert(0, 'ones', 1)
X = data2.values[:, :-1]
y = data2.values[:, -1].reshape((-1, 1))
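For reference, the scaling applied above is the standard mean normalization from the course: each feature is shifted by its mean $\mu_j$ and divided by its standard deviation $\sigma_j$,

$$x_j := \frac{x_j - \mu_j}{\sigma_j},$$

which puts house size and bedroom count on comparable ranges so that gradient descent converges in far fewer iterations.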
Define the hypothesis function
def h(X, theta):
    return np.dot(X, theta)
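In matrix form this is the course's hypothesis $h_\theta(X) = X\theta$, where the first column of $X$ is the all-ones intercept column inserted above.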
Define the cost function
def computeCost(X, theta, y):
    return 0.5 * np.mean(np.square(h(X, theta) - y))
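Since np.mean already divides by the number of examples $m$, this is exactly the course's cost function

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2.$$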
Define the gradient descent function
def gradientDescent(X, theta, y, iterations, alpha):
    Cost = [computeCost(X, theta, y)]
    grad = np.zeros(len(theta))
    for _ in range(iterations):
        # Compute every partial derivative with the current theta ...
        for j in range(len(theta)):
            grad[j] = np.mean((h(X, theta) - y) * X[:, j].reshape([len(X), 1]))
        # ... then update all parameters simultaneously
        for k in range(len(theta)):
            theta[k] = theta[k] - alpha * grad[k]
        Cost.append(computeCost(X, theta, y))
    return theta, Cost
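Each pass implements the batch update rule

$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}.$$

The two inner loops can also be collapsed into a single matrix expression; here is a minimal vectorized sketch (equivalent in behavior, not part of the original post):

def gradientDescentVec(X, theta, y, iterations, alpha):
    m = len(X)
    Cost = [computeCost(X, theta, y)]
    for _ in range(iterations):
        # X.T @ residuals / m computes all partial derivatives at once
        theta = theta - alpha / m * (X.T @ (h(X, theta) - y))
        Cost.append(computeCost(X, theta, y))
    return theta, Cost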
Parameter initialization
iterations = 200
lr = [1, 0.3, 0.1, 0.03, 0.01]
# Compare convergence for several learning rates
_, ax = plt.subplots(figsize=(10, 6))
for l in lr:
    theta = np.zeros((X.shape[1], 1))
    _, Cost = gradientDescent(X, theta, y, iterations, l)
    ax.plot(Cost, label='lr=%.2f' % l)
ax.set_xlabel('iterations')
ax.set_ylabel('Cost')
ax.legend()
plt.show()
theta = np.zeros((X.shape[1], 1))
theta_result, Cost_result = gradientDescent(X, theta, y, iterations, 0.3)
theta_result
array([[340412.65957447],
       [110631.05027879],
       [ -6649.47427076]])
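As a quick usage check (a hypothetical example, not part of the original post), the learned parameters can price a 1650 sq-ft, 3-bedroom house. Because data2 was normalized in place, the raw means and standard deviations are recovered here by re-reading the file:

raw = pd.read_csv('ex1data2.txt', sep=',', header=None,
                  names=['size', 'bedrooms', 'price'])
mu, sigma = raw.iloc[:, :-1].mean(), raw.iloc[:, :-1].std()
# A new example must be scaled with the training statistics
x_new = np.array([[1.0,
                   (1650 - mu['size']) / sigma['size'],
                   (3 - mu['bedrooms']) / sigma['bedrooms']]])
h(x_new, theta_result)  # predicted price in dollars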
Normal equation
theta_ref = np.linalg.inv(X.T @ X) @ X.T @ y
theta_ref
array([[340412.65957447],
       [110631.05027885],
       [ -6649.47427082]])
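For reference, the closed-form solution computed above is

$$\theta = (X^{T}X)^{-1}X^{T}y.$$

When $X^{T}X$ is singular or ill-conditioned, np.linalg.pinv(X.T @ X) is the usual more robust substitute (a standard caveat, not from the original post).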
The solutions found by gradient descent and by the normal equation are nearly identical.