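Linear regression is a commonly used model. This article briefly introduces how linear regression works and how to implement a linear regression model in code.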

What is linear regression

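Here is a simple example. Suppose I need a loan and want to know how much the bank will lend me. From various sources I have learned that the bank's credit limit depends on a person's age and salary. For example, I have obtained the following data: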

Salary	Age	Credit limit
4000	25	20000
5000	28	35000
7500	33	50000

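From these data I work out the relationship among the three quantities and build a function model, for example $y = \theta_1x_1 + \theta_2x_2 + \varepsilon$. I then plug my own salary and age into this model to predict how much I can borrow from the bank. This is linear regression analysis; because it involves two independent variables (salary and age), it is also called multiple linear regression. The parameters $\theta_1$ and $\theta_2$ express how strongly salary and age affect the credit limit, and $\varepsilon$ is the error term, which follows a normal distribution with mean 0.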
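As a rough illustration of the idea, the three rows of the table can be fitted with ordinary least squares. This is only a sketch: three samples are far too few for a meaningful fit, the applicant's salary and age (6000 and 30) are made up, and `np.linalg.lstsq` is used here simply as a convenient solver rather than the method developed later in this article.

```python
import numpy as np

# Rows of the table: salary, age -> credit limit
X = np.array([[4000.0, 25.0],
              [5000.0, 28.0],
              [7500.0, 33.0]])
y = np.array([20000.0, 35000.0, 50000.0])

# Least-squares estimate of theta_1, theta_2 for y = theta_1*x_1 + theta_2*x_2
theta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Hypothetical applicant (values are made up for illustration)
applicant = np.array([6000.0, 30.0])
predicted_limit = applicant @ theta

print("theta:", theta)
print("predicted limit:", predicted_limit)
```

Because salary and age live on very different scales, the raw coefficients are hard to compare directly, which is one reason the implementation later standardizes the features first.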

Some math

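Suppose a linear regression problem has $n$ variables. We can then build the function $h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \cdots + \theta_nx_n$, where $\theta_0$ is the bias term. Setting $x_0 = 1$, this can be written compactly as $h_\theta(x) = \sum_{i = 0}^{n}\theta_ix_i = \theta^Tx$.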
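The compact form $\theta^Tx$ maps directly onto a matrix product once a column of ones is prepended for the bias term (the convention $x_0 = 1$). A minimal sketch, with arbitrary parameter values chosen only for the demonstration:

```python
import numpy as np

def hypothesis(X, theta):
    """Return h_theta(x) = theta^T x for every row of X (bias column added here)."""
    ones = np.ones((X.shape[0], 1))
    X_with_bias = np.hstack((ones, X))      # x_0 = 1 for every sample
    return X_with_bias @ theta

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
theta = np.array([0.5, 1.0, -1.0])          # theta_0, theta_1, theta_2 (arbitrary values)

print(hypothesis(X, theta))                 # one prediction per row of X
```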

Because the true value and the predicted value will always differ, for each sample we have $y^{(i)} = \theta^Tx^{(i)} + \varepsilon^{(i)}$.

The error follows a normal distribution with mean 0, so $p(\varepsilon^{(i)}) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(\varepsilon^{(i)})^2}{2\sigma^2}\right)$.
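The step from this Gaussian assumption to a least-squares objective is worth spelling out. Assuming the errors of the $m$ samples are independent and share the same $\sigma$ (an assumption not stated explicitly here), substituting $\varepsilon^{(i)} = y^{(i)} - \theta^Tx^{(i)}$ gives the log-likelihood:

```latex
\begin{aligned}
\log L(\theta)
  &= \sum_{i=1}^{m} \log\left[ \frac{1}{\sqrt{2\pi}\sigma}
     \exp\left( -\frac{(y^{(i)} - \theta^Tx^{(i)})^2}{2\sigma^2} \right) \right] \\
  &= m \log\frac{1}{\sqrt{2\pi}\sigma}
     - \frac{1}{\sigma^2} \cdot \frac{1}{2} \sum_{i=1}^{m} (y^{(i)} - \theta^Tx^{(i)})^2
\end{aligned}
```

The first term does not depend on $\theta$, so maximizing the likelihood amounts to minimizing half the sum of squared errors.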

Least squares: $J(\theta) = \frac{1}{2}\sum_{i = 1}^{m}(y^{(i)} - \theta^Tx^{(i)})^2$, where $m$ is the number of samples.
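To make the objective concrete, here is a small sketch that evaluates $J(\theta)$ with NumPy. The function name `least_squares_cost` and the toy data are my own choices for illustration; the design matrix is assumed to already contain the column of ones for the bias term.

```python
import numpy as np

def least_squares_cost(X, y, theta):
    """J(theta) = 1/2 * sum_i (y_i - theta^T x_i)^2; X already includes the x_0 = 1 column."""
    residuals = y - X @ theta
    return 0.5 * np.sum(residuals ** 2)

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])        # generated by y = 1 + 2x

print(least_squares_cost(X, y, np.array([1.0, 2.0])))   # parameters that fit exactly
print(least_squares_cost(X, y, np.array([0.0, 0.0])))   # all-zero parameters
```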

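Mini-batch gradient descent: $\theta_j := \theta_j - \alpha\frac{1}{10}\sum_{k = i}^{i + 9}(h_\theta(x^{(k)}) - y^{(k)})x_j^{(k)}$ ($\alpha$ is the learning rate, usually small; each update is computed from 10 samples, and the batch size should be adjusted to the situation at hand).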
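The update rule can be sketched as follows. This is an illustrative version under my own assumptions: the function name, the default learning rate and epoch count, and the synthetic data are all made up, and the samples are reshuffled every epoch instead of being taken as consecutive windows as in the formula.

```python
import numpy as np

def minibatch_gradient_descent(X, y, alpha=0.05, batch_size=10, num_epochs=200, seed=0):
    """Mini-batch gradient descent for linear regression.

    X must already contain the x_0 = 1 column; y is a 1-D array.
    """
    rng = np.random.default_rng(seed)
    num_examples, num_features = X.shape
    theta = np.zeros(num_features)
    for _ in range(num_epochs):
        order = rng.permutation(num_examples)          # reshuffle every epoch
        for start in range(0, num_examples, batch_size):
            batch = order[start:start + batch_size]
            error = X[batch] @ theta - y[batch]        # h_theta(x) - y on the batch
            gradient = X[batch].T @ error / len(batch) # average over the batch
            theta = theta - alpha * gradient
    return theta

# Synthetic data from y = 4 + 3x plus small noise (made up for the demonstration)
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack((np.ones_like(x), x))
y = 4.0 + 3.0 * x + rng.normal(0.0, 0.1, size=200)

theta = minibatch_gradient_descent(X, y)
print(theta)   # should land near the generating values 4 and 3
```

Setting `batch_size` to the full number of samples turns this into ordinary batch gradient descent, which is what the class later in this article implements.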

Code implementation

Data preprocessing

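Write a prepare_for_training function that applies feature transformations, standardization, and similar operations to the data. It returns the processed data together with the mean and standard deviation.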

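import numpy as np

# normalize, generate_sinusoids and generate_polynomials are helper functions
# that are not listed in this article.
def prepare_for_training(data, polynomial_degree=0, sinusoid_degree=0, normalize_data=True):

    # Total number of samples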
    num_examples = data.shape[0]

    data_processed = np.copy(data)

    # Preprocessing: optional normalization
    features_mean = 0
    features_deviation = 0
    data_normalized = data_processed
    if normalize_data:
        (
            data_normalized,
            features_mean,
            features_deviation
        ) = normalize(data_processed)

        data_processed = data_normalized

    # Sinusoidal feature transform
    if sinusoid_degree > 0:
        sinusoids = generate_sinusoids(data_normalized, sinusoid_degree)
        data_processed = np.concatenate((data_processed, sinusoids), axis=1)

    # Polynomial feature transform
    if polynomial_degree > 0:
        polynomials = generate_polynomials(data_normalized, polynomial_degree)
        data_processed = np.concatenate((data_processed, polynomials), axis=1)

    # Prepend a column of ones (the bias term, x_0 = 1)
    data_processed = np.hstack((np.ones((num_examples, 1)), data_processed))

    return data_processed, features_mean, features_deviation
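The `normalize` helper called in the function is not shown in this article. A minimal sketch of what such a helper might look like, assuming plain z-score standardization per column and the same three return values that the call site unpacks; the actual helper in the author's repository may differ:

```python
import numpy as np

def normalize(features):
    """Standardize each column to zero mean and unit standard deviation.

    Hypothetical stand-in for the helper used by prepare_for_training.
    """
    features = features.astype(float)
    features_mean = np.mean(features, axis=0)
    features_deviation = np.std(features, axis=0)
    # Guard against constant columns, which would otherwise divide by zero
    features_deviation[features_deviation == 0] = 1.0
    features_normalized = (features - features_mean) / features_deviation
    return features_normalized, features_mean, features_deviation

data = np.array([[4000.0, 25.0],
                 [5000.0, 28.0],
                 [7500.0, 33.0]])
normalized, mean, deviation = normalize(data)
print(normalized.mean(axis=0))   # close to 0 for every column
print(normalized.std(axis=0))    # close to 1 for every column
```

Returning the mean and standard deviation alongside the data makes it possible to apply the same scaling to new inputs later.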

Linear regression module

Write a LinearRegression class containing the methods related to linear regression.

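import numpy as np
import prepare_for_training  # module that contains the prepare_for_training function


class LinearRegression:

    def __init__(self, data, labels, polynomial_degree=0, sinusoid_degree=0, normalize_data=True):
        """
        Constructor reconstructed from the attributes the methods rely on
        (an assumption: it is not listed in the article, and starting theta
        at zeros is a guess).
        """
        data_processed = prepare_for_training.prepare_for_training(
            data, polynomial_degree, sinusoid_degree, normalize_data)[0]
        self.data = data_processed
        self.labels = labels
        self.polynomial_degree = polynomial_degree
        self.sinusoid_degree = sinusoid_degree
        self.normalize_data = normalize_data
        self.theta = np.zeros((self.data.shape[1], 1))

    def train(self, alpha, num_iterations = 500):
        """
        Training entry point: runs gradient descent
        """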
        cost_history = self.gradient_descent(alpha, num_iterations)
        return self.theta, cost_history

    def gradient_descent(self, alpha, num_iterations):
        """
        Iteration loop: runs num_iterations gradient steps and records the cost after each one
        """
        cost_history = []
        for _ in range(num_iterations):
            self.gradient_step(alpha)
            cost_history.append(self.cost_function(self.data, self.labels))
        return cost_history

    def gradient_step(self, alpha):
        """
        Parameter update for one gradient-descent step (vectorized with matrix operations)
        """
        num_examples = self.data.shape[0]
        prediction = LinearRegression.hypothesis(self.data, self.theta)
        delta = prediction - self.labels
        theta = self.theta
        theta = theta - alpha * (1/num_examples)*(np.dot(delta.T, self.data)).T
        self.theta = theta

    def cost_function(self, data, labels):
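        """
        Cost: half the mean squared error over the given data
        """
        num_examples = data.shape[0]
        delta = LinearRegression.hypothesis(data, self.theta) - labels
        # delta is a column vector, so delta.T @ delta is the 1x1 sum of squared errors
        cost = (1 / 2) * np.dot(delta.T, delta) / num_examples
        return cost[0][0]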

    @staticmethod
    def hypothesis(data, theta):
        predictions = np.dot(data, theta)
        return predictions

    def get_cost(self, data, labels):
        data_processed = prepare_for_training.prepare_for_training(data,
                                    self.polynomial_degree,
                                    self.sinusoid_degree,
                                    self.normalize_data)[0]
        return self.cost_function(data_processed, labels)

    def predict(self, data):
        """
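        Predict regression values with the trained parameters.
        As written, the input is normalized with its own mean and standard
        deviation rather than the statistics of the training set.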
        """
        data_processed = prepare_for_training.prepare_for_training(data,
                                    self.polynomial_degree,
                                    self.sinusoid_degree,
                                    self.normalize_data)[0]
        predictions = LinearRegression.hypothesis(data_processed, self.theta)

        return predictions
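One way to check an update rule like the one in gradient_step is to compare its result with the closed-form least-squares solution on data where the answer is known. The sketch below is standalone and does not use the class: `batch_gradient_descent` is a hypothetical function that repeats the same vectorized update, and the synthetic data, learning rate, and iteration count are made up for the check.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha, num_iterations):
    """Full-batch updates: theta -= alpha * (1/m) * X^T (X theta - y)."""
    num_examples = X.shape[0]
    theta = np.zeros((X.shape[1], 1))
    for _ in range(num_iterations):
        delta = X @ theta - y                              # column vector of errors
        theta = theta - alpha * (1 / num_examples) * (delta.T @ X).T
    return theta

# Synthetic, roughly standardized feature (made up for the check)
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
X = np.hstack((np.ones((100, 1)), x))                      # bias column first
y = 5.0 + 0.8 * x + rng.normal(0.0, 0.3, size=(100, 1))

theta_gd = batch_gradient_descent(X, y, alpha=0.1, num_iterations=2000)
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)        # closed-form reference

print(theta_gd.ravel())
print(theta_exact.ravel())
```

If the two results disagree noticeably, the usual suspects are a learning rate that is too large, too few iterations, or a shape mismatch between the labels and the predictions.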

Practical application

Use the LinearRegression class to build the model, make predictions, compute the loss, and so on.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from linear_regression import LinearRegression

data = pd.read_csv('../data/world-happiness-report-2017.csv')

# Split into training and test data
train_data = data.sample(frac= 0.8)
test_data = data.drop(train_data.index)

input_param_name = 'Economy..GDP.per.Capita.'
output_param_name = 'Happiness.Score'

x_train = train_data[[input_param_name]].values
y_train = train_data[[output_param_name]].values

x_test = test_data[[input_param_name]].values
y_test = test_data[[output_param_name]].values

plt.scatter(x_train, y_train, label='Train data')
plt.scatter(x_test, y_test, label='Test data')
plt.xlabel(input_param_name)
plt.ylabel(output_param_name)
plt.title('Happy')
plt.legend()
plt.show()

num_iterations = 500
learning_rate = 0.01

linear_regression = LinearRegression(x_train, y_train)
(theta, cost_history) = linear_regression.train(learning_rate, num_iterations)

print('Initial cost:', cost_history[0])
print('Cost after training:', cost_history[-1])

plt.plot(range(num_iterations), cost_history)
plt.xlabel('Iteration')
plt.ylabel('Cost')
plt.title('GD')
plt.show()

predictions_num = 100
x_predictions = np.linspace(x_train.min(), x_train.max(), predictions_num).reshape(predictions_num,1)
y_predictions = linear_regression.predict(x_predictions)

plt.scatter(x_train, y_train, label='Train data')
plt.scatter(x_test, y_test, label='Test data')
plt.plot(x_predictions, y_predictions, 'r', label='Prediction')
plt.xlabel(input_param_name)
plt.ylabel(output_param_name)
plt.title('Happy')
plt.legend()
plt.show()

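The author's run produced the following output (exact values vary from run to run because the train/test split is random).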

Initial cost: 30.901003728555025
Cost after training: 0.3040912375218431
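Since `data.sample` draws a different random subset on every run, the printed costs are not reproducible. A small sketch of how the split could be pinned down with the `random_state` argument of `DataFrame.sample`; the synthetic frame, its column names, and the seed 42 are stand-ins of my own, not the real happiness dataset:

```python
import numpy as np
import pandas as pd

# Small synthetic frame standing in for the real CSV (values are made up)
data = pd.DataFrame({
    "gdp": np.linspace(0.0, 1.8, 50),
    "score": np.linspace(3.0, 7.5, 50),
})

# random_state fixes the shuffle, so repeated runs select the same rows
train_data = data.sample(frac=0.8, random_state=42)
test_data = data.drop(train_data.index)

train_again = data.sample(frac=0.8, random_state=42)
print(len(train_data), len(test_data))
print(train_data.index.equals(train_again.index))
```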

Summary

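This article is only a simple, informal introduction to linear regression and does not aim to be rigorous; for the details, please consult other references. The linear regression code is also fairly rough and its results are not particularly good: when the number of iterations is large, the loss can even rise slightly. The main goal is to understand the overall workflow of linear regression.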

Code download: RossHe7's GitHub

