Three variants of gradient descent
- Batch gradient descent
- Stochastic gradient descent
- Mini-batch gradient descent
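The mini-batch variant can be sketched in plain NumPy; this is a minimal illustration on hypothetical synthetic data (y = 4 + 3x + noise), not the sklearn implementation:

```python
import numpy as np

np.random.seed(42)
X = 2 * np.random.rand(100, 1)                 # hypothetical data
y = 4 + 3 * X + np.random.randn(100, 1)        # true theta = (4, 3)

X_b = np.c_[np.ones((100, 1)), X]              # prepend bias column
theta = np.random.randn(2, 1)                  # random initialization

n_epochs, batch_size, eta = 50, 20, 0.1
m = len(X_b)
for epoch in range(n_epochs):
    indices = np.random.permutation(m)         # reshuffle every epoch
    for start in range(0, m, batch_size):
        batch = indices[start:start + batch_size]
        xi, yi = X_b[batch], y[batch]
        # MSE gradient on the mini-batch only
        gradients = 2 / len(batch) * xi.T @ (xi @ theta - yi)
        theta -= eta * gradients
```

Batch gradient descent is the special case `batch_size = m`; stochastic gradient descent is `batch_size = 1`.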
Polynomial regression
- PolynomialFeatures expands each feature into higher-degree terms
from sklearn.preprocessing import PolynomialFeatures
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
X[0]
#array([-0.75275929])
X_poly[0]
#array([-0.75275929, 0.56664654])  # the original feature and its square
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)
lin_reg.intercept_, lin_reg.coef_
#(array([1.78134581]), array([[0.93366893, 0.56456263]]))
# the learned theta parameters (intercept and coefficients)
Learning curves
- x-axis: training set size
- y-axis: RMSE (the cost/loss)
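A learning curve can be built by retraining on growing slices of the training set and recording train/validation RMSE; this is a minimal sketch on hypothetical quadratic data (the variable names and data are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3             # hypothetical data
y = 0.5 * X**2 + X + 2 + np.random.randn(100, 1)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)

train_errors, val_errors = [], []
for m in range(1, len(X_train) + 1):
    model = LinearRegression()
    model.fit(X_train[:m], y_train[:m])        # fit on the first m samples
    y_train_pred = model.predict(X_train[:m])
    y_val_pred = model.predict(X_val)
    train_errors.append(np.sqrt(mean_squared_error(y_train[:m], y_train_pred)))
    val_errors.append(np.sqrt(mean_squared_error(y_val, y_val_pred)))

# To plot: x-axis = training set size, y-axis = RMSE, e.g.
# import matplotlib.pyplot as plt
# plt.plot(range(1, len(X_train) + 1), train_errors, "r-+", label="train")
# plt.plot(range(1, len(X_train) + 1), val_errors, "b-", label="val")
```

Both curves plateauing at a high RMSE indicates underfitting; a large persistent gap between them indicates overfitting.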
Early stopping
from copy import deepcopy
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

sgd_reg = SGDRegressor(max_iter=1, warm_start=True, penalty=None,
                       learning_rate="constant", eta0=0.0005, random_state=42)

minimum_val_error = float("inf")
best_epoch = None
best_model = None
for epoch in range(1000):
    sgd_reg.fit(X_train_poly_scaled, y_train)  # continues where it left off
    y_val_predict = sgd_reg.predict(X_val_poly_scaled)
    val_error = mean_squared_error(y_val, y_val_predict)
    if val_error < minimum_val_error:
        minimum_val_error = val_error
        best_epoch = epoch
        # deepcopy keeps the fitted weights; sklearn.base.clone would
        # return an *unfitted* copy and lose them
        best_model = deepcopy(sgd_reg)

best_epoch, best_model
#(239, SGDRegressor(alpha=0.0001, average=False, epsilon=0.1, eta0=0.0005,
#       fit_intercept=True, l1_ratio=0.15, learning_rate='constant',
#       loss='squared_loss', max_iter=1, n_iter=None, penalty=None,
#       power_t=0.25, random_state=42, shuffle=True, tol=None,
#       verbose=0, warm_start=True))