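Regression analysis is a widely used statistical tool for modelling the relationship between two variables. One of these variables is called the predictor variable, whose values are gathered through experiments. The other is called the response variable, whose values are derived from the predictor variable.
In linear regression, the two variables are related through an equation in which the exponent (power) of both variables is 1. Mathematically, a linear relationship is a straight line when plotted as a graph. A non-linear relationship, in which the exponent of any variable is not equal to 1, produces a curve.
The general mathematical equation for a linear regression is -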
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">y = ax + b
</pre>
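Following is the description of the parameters used -
y is the response variable.
x is the predictor variable.
a and b are constants called the coefficients: a is the slope of the line and b is the intercept.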
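As a minimal illustration of the equation above, the following R sketch evaluates y = ax + b for a few values of x. The coefficient values a = 2 and b = 1 and the function name linear_response are assumptions chosen purely for illustration; they do not come from any data set.

```r
# Hypothetical coefficients, chosen only for illustration.
a <- 2
b <- 1

# Evaluate the linear equation y = ax + b for a vector of predictor values.
linear_response <- function(x, a, b) {
  a * x + b
}

x_values <- c(0, 1, 2, 3)
y_values <- linear_response(x_values, a, b)
print(y_values)

# Equal steps in x give equal steps in y, which is why the graph is a straight line.
print(diff(y_values))
```

Every difference printed by the last line equals the slope a, because x increases by 1 at each step.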
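Steps to Establish a Regression
A simple example of regression is predicting a person's weight when their height is known. To do this, we need the relationship between a person's height and weight.
The steps to create the relationship are -
Carry out an experiment to gather a sample of observed heights and the corresponding weights.
Create a relationship model using the lm() function in R.
Find the coefficients from the model created and use them to write the mathematical equation.
Get a summary of the relationship model to see the error in prediction, also called the residuals.
To predict the weight of new persons, use the predict() function in R.
Input Data
Below is the sample data representing the observations -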
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;"># Values of height
151, 174, 138, 186, 128, 136, 179, 163, 152, 131
# Values of weight
63, 81, 56, 91, 47, 57, 76, 72, 62, 48
</pre>
lm() Function
This function creates the relationship model between the predictor and the response variable.
Syntax
The basic syntax for the lm() function in linear regression is -
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">lm(formula,data)
</pre>
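Following is the description of the parameters used -
formula is a symbolic description of the relationship between x and y, for example y ~ x.
data is the data frame containing the variables named in the formula. If it is omitted, as in the examples here, the variables are looked up in the environment from which lm() is called.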
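To show how the two arguments fit together, here is a small sketch that passes both a formula and a data argument to lm(). The data frame name people and the column names height and weight are assumptions introduced for illustration; the values are the sample observations listed earlier.

```r
# Hypothetical data frame holding the sample observations.
people <- data.frame(
  height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# The formula reads "weight modelled as a function of height".
# lm() looks up the names height and weight in the data argument.
model <- lm(weight ~ height, data = people)
print(model)
```

Keeping the observations in one data frame with descriptive column names tends to make the printed model easier to read than bare x and y vectors.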
Create the Relationship Model and Get the Coefficients
<pre class="prettyprint notranslate tryit" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
print(relation)
</pre>
When we execute the above code, it produces the following result -
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x
   -38.4551       0.6746
</pre>
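The coefficients can also be read from the model programmatically and used to write out the equation. The sketch below uses coef(), which returns a named numeric vector for an lm model; the variable names fit_coef, a and b are chosen here for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# coef() returns a named vector holding the intercept and the slope.
fit_coef <- coef(relation)
b <- fit_coef[["(Intercept)"]]
a <- fit_coef[["x"]]

# Write the fitted model in the form y = ax + b.
cat(sprintf("weight = %.4f * height + (%.4f)\n", a, b))
```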
Get the Summary of the Relationship
<pre class="prettyprint notranslate tryit" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
print(summary(relation))
</pre>
When we execute the above code, it produces the following result -
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">Call:
lm(formula = y ~ x)
Residuals:
    Min      1Q  Median      3Q     Max
-6.3002 -1.6629  0.0412  1.8944  3.9775
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -38.45509    8.04901  -4.778  0.00139 **
x             0.67461    0.05191  12.997 1.16e-06 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.253 on 8 degrees of freedom
Multiple R-squared: 0.9548, Adjusted R-squared: 0.9491
F-statistic: 168.9 on 1 and 8 DF, p-value: 1.164e-06
</pre>
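Individual figures from the summary can be extracted instead of being read off the printed report. The following sketch pulls out the residuals, the residual standard error and the R-squared value; the variable name model_summary is an assumption made for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)
model_summary <- summary(relation)

# Residuals: observed weight minus fitted weight, one value per observation.
model_residuals <- residuals(relation)
print(model_residuals)

# Residual standard error, the value reported on the
# "Residual standard error" line of the printed summary.
print(model_summary$sigma)

# Proportion of the variance in weight explained by height.
print(model_summary$r.squared)
```

The residuals of a least-squares fit with an intercept sum to zero up to rounding, so the residual standard error is usually the more informative measure of the typical size of a prediction error.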
predict() Function
Syntax
The basic syntax for predict() in linear regression is -
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">predict(object, newdata)
</pre>
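Following is the description of the parameters used -
object is the fitted model that has already been created using the lm() function.
newdata is a data frame containing the new values of the predictor variable; its column name must match the predictor name used in the formula.
Predict the Weight of New Persons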
<pre class="prettyprint notranslate tryit" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;"># The predictor vector.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector.
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation,a)
print(result)
</pre>
When we execute the above code, it produces the following result -
<pre class="result notranslate" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;"> 1
76.22869
</pre>
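predict() can also handle several new heights at once and report a range around each prediction. The sketch below is one way to do this, using the interval = "prediction" option of predict() for lm models; the heights 160, 170 and 180 and the names new_people and manual are assumptions chosen for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# newdata must be a data frame whose column name matches the predictor (x).
new_people <- data.frame(x = c(160, 170, 180))

# interval = "prediction" adds lower (lwr) and upper (upr) bounds
# for an individual new observation, at the default 95% level.
result <- predict(relation, new_people, interval = "prediction")
print(result)

# The fitted value for height 170 can be checked by hand with y = ax + b.
manual <- coef(relation)[["(Intercept)"]] + coef(relation)[["x"]] * 170
print(manual)
```

The fit column holds the predicted weights, and the hand-computed value should agree with the entry for height 170.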
Visualize the Regression Graphically
<pre class="prettyprint notranslate tryit" style="margin: 15px 0px; padding: 10px 5px; position: relative; width: auto; max-width: 700px; box-sizing: border-box; display: block; line-height: 1.7; background: rgb(239, 239, 239); border-radius: 3px; font-size: 14px; font-family: Consolas, "Courier New", Courier, monospace; overflow-x: auto; border: 1px solid rgb(221, 221, 221); word-wrap: break-word !important; white-space: pre-wrap !important; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;"># Create the predictor and response variable.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)
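# Give the chart file a name.
png(file = "linearregression.png")
# Plot the observations: height (predictor) on the x-axis, weight (response) on the y-axis.
plot(x, y, col = "blue", main = "Height & Weight Regression",
cex = 1.3, pch = 16, xlab = "Height in cm", ylab = "Weight in Kg")
# Add the fitted regression line from the model created above.
abline(relation)
# Save the file.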
dev.off()
</pre>
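When we execute the above code, it saves a chart to the file linearregression.png, showing the observations as points together with a regression line.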