Applications of the cost function and the gradient descent algorithm
/#1
Consider the following training set of m=4 training examples:
x | y |
---|---|
1 | 0.5 |
2 | 1 |
4 | 2 |
0 | 0 |
Consider the linear regression model hθ(x)=θ0+θ1x. What are the values of θ0 and θ1 that you would expect to obtain upon running gradient descent on this model? (Linear regression will be able to fit this data perfectly.)
A. θ0=0.5,θ1=0
B. θ0=0.5,θ1=0.5
C. θ0=1,θ1=1
D. θ0=1,θ1=0.5
F. θ0=0,θ1=0.5
Analysis: the four training examples lie exactly on the line y = 0.5x (a simple linear equation in one variable), so gradient descent on a model that can fit the data perfectly converges to θ0 = 0 and θ1 = 0.5; the answer is F.
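As a sanity check, here is a minimal sketch (my own, not part of the quiz) that runs batch gradient descent on the four examples; the learning rate and iteration count are arbitrary choices for illustration, and the parameters settle at θ0 ≈ 0, θ1 ≈ 0.5:

```python
# Minimal sketch: batch gradient descent on the four training examples,
# confirming convergence to theta0 = 0, theta1 = 0.5.
X = [1.0, 2.0, 4.0, 0.0]
y = [0.5, 1.0, 2.0, 0.0]
m = len(X)

theta0, theta1 = 0.0, 0.0   # arbitrary initialization
alpha = 0.1                 # learning rate chosen for illustration

for _ in range(5000):
    errors = [theta0 + theta1 * x - t for x, t in zip(X, y)]   # h_theta(x) - y
    grad0 = sum(errors) / m
    grad1 = sum(e * x for e, x in zip(errors, X)) / m
    # simultaneous update of both parameters
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(round(theta0, 4), round(theta1, 4))   # approximately 0.0 and 0.5
```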
/#2
Let f be some function so that f(θ0,θ1) outputs a number. For this problem, f is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so f may have local optima). Suppose we use gradient descent to try to minimize f(θ0,θ1) as a function of θ0 and θ1. Which of the following statements are true? (Check all that apply.)
A. If θ0 and θ1 are initialized at the global minimum, then one iteration will not change their values.
B. Setting the learning rate α to be very small is not harmful, and can only speed up the convergence of gradient descent.
C. If the first few iterations of gradient descent cause f(θ0,θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value.
D. No matter how θ0 and θ1 are initialized, so long as α is sufficiently small, we can safely expect gradient descent to converge to the same solution.
Analysis: the learning rate α only controls how fast the parameters change. A very small α merely slows convergence rather than speeding it up (so B is false), and because f may have local optima, gradient descent can converge to different solutions from different initializations (so D is false). At the global minimum the gradient is zero, so one iteration leaves the values unchanged (A is true), and a too-large α can cause f to increase (C is true).
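A toy sketch (my own example, using the assumed function f(θ) = θ² rather than the quiz's unknown f) to illustrate the two learning-rate claims: a very small α still decreases f, only slowly, while a too-large α makes f increase at every step:

```python
# Toy illustration on f(theta) = theta**2, whose gradient is 2 * theta.
def descend(alpha, steps=10, theta=1.0):
    history = []
    for _ in range(steps):
        theta -= alpha * 2 * theta      # gradient descent step
        history.append(theta ** 2)      # value of f after the step
    return history

print(descend(alpha=0.01))   # very small alpha: f decreases, but slowly (B is false)
print(descend(alpha=1.5))    # too-large alpha: f grows every step (C is true)
```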
/#3
For this question, assume that we are using the training set from Q1. Recall that our definition of the cost function was J(θ0,θ1) = (1/(2m)) ∑_{i=1}^{m} (hθ(x^(i)) − y^(i))². What is J(0,1)? In the box below, please enter your answer (simplify fractions to decimals when entering the answer, and use '.' as the decimal delimiter, e.g., 1.5).
Analysis: with θ0 = 0 and θ1 = 1 we have hθ(x) = x, so J(0,1) = (1/(2·4))[(1 − 0.5)² + (2 − 1)² + (4 − 2)² + (0 − 0)²] = 5.25/8 = 0.65625.
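A short sketch (my own) that evaluates the cost function on the Q1 training set to confirm the value of J(0,1):

```python
# Evaluate J(theta0, theta1) = (1 / (2m)) * sum of squared errors on the Q1 data.
X = [1.0, 2.0, 4.0, 0.0]
y = [0.5, 1.0, 2.0, 0.0]
m = len(X)

theta0, theta1 = 0.0, 1.0
J = sum((theta0 + theta1 * x - t) ** 2 for x, t in zip(X, y)) / (2 * m)
print(J)   # 0.65625
```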
Multivariate linear regression
Suppose m=4 students have taken some class, and the class had a midterm exam and a final exam. You have collected a dataset of their scores on the two exams, which is as follows:
midterm exam | (midterm exam)^2 | final exam |
---|---|---|
89 | 7921 | 96 |
72 | 5184 | 74 |
94 | 8836 | 87 |
69 | 4761 | 78 |
You'd like to use polynomial regression to predict a student's final exam score from their midterm exam score. Concretely, suppose you want to fit a model of the form hθ(x)=θ0+θ1x1+θ2x2, where x1 is the midterm score and x2 is (midterm score)^2. Further, you plan to use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.
What is the normalized feature x1^(3)? (Hint: midterm = 94, final = 87 is training example 3.) Please round off your answer to two decimal places and enter it in the text box below.
Formula: normalized feature = (value − mean) / (max − min)
Analysis: x1^(3) is the midterm score of training example 3, so the mean and range must come from the midterm-exam column, not the squared column.
Mean = (89 + 72 + 94 + 69) / 4 = 81
Max − Min = 94 − 69 = 25
x1^(3) = (94 − 81) / 25 = 0.52
Rounded to two decimal places: 0.52
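A small sketch (my own) of the mean-normalization computation for the midterm feature of training example 3:

```python
# Mean normalization with range scaling for the midterm feature x1.
midterm = [89, 72, 94, 69]

mean = sum(midterm) / len(midterm)          # 81.0
value_range = max(midterm) - min(midterm)   # 94 - 69 = 25

x1_3 = (94 - mean) / value_range            # training example 3 has midterm = 94
print(round(x1_3, 2))   # 0.52
```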