Deep Learning 2: Linear Regression Note
Published: 2019-05-25


In this blog, I will summarise the theory and the implementation of Linear Regression.

The base materials I used are the CS229 Lecture Notes, Part 1, and Coursera's Machine Learning course, Lectures 2 to 3.

If you haven't read these materials, I suggest you read them first.

Linear Regression is the basic problem we care about when we study Supervised Learning.

Since there are lots of formulas and figures, I don't have time to make screenshots of all of them. Maybe I am too lazy.

This is my first blog written in English. Why? Because it is nearly impossible for us to study deep learning only in Chinese. I believe it is much more efficient to use English to describe all the related concepts.

OK. This blog may be chaotic, but it follows my thinking style.

OK!

Linear Regression

The simplest problem is Linear Regression with one variable.

We assume a hypothesis h(x) = y, where x comes from the data set. We want to find optimal thetas to fit h(x), so that we can use it for prediction.
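For the one-variable case, the hypothesis (using the notation of the CS229 notes and Ng's course) is a straight line,

h_\theta(x) = \theta_0 + \theta_1 x,

and the goal is to choose \theta_0 and \theta_1 so that h_\theta(x) is close to y on the training examples.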

We already have a training data set. We will use this data to solve the problem.

Step 1: Cost function: LMS (least mean squares).
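Concretely, with m training examples the least-squares cost is

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)^2,

which is exactly what the computeCost function implements later in this post.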

Step 2: Gradient descent to find the thetas.

Repeat until convergence:
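The update, applied simultaneously to every parameter \theta_j, is

\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j} = \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big) \, x_j^{(i)},

where \alpha is the learning rate.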

The key problem is to compute the partial derivatives!

This is easy in Linear Regression, but far more complicated in Neural Networks, where Back Propagation is the approach.

I won't copy every formula from the notes here; see the references above for the full derivations.

Batch gradient descent (uses the whole training set in every iteration).

Stochastic gradient descent (uses a single training example, or a small subset, per iteration; it is much faster on large data sets, and in practice it still ends up close to the optimum).
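As a rough sketch (my own illustration, not part of the course assignment; the function name and the random shuffling are my choices), one pass of stochastic gradient descent in Matlab could look like this:

function theta = stochasticGradientDescent(X, y, theta, alpha, num_epochs)
% Sketch of stochastic gradient descent for linear regression.
% X is m x (n+1) with a leading column of ones, y is m x 1, theta is (n+1) x 1.
m = length(y);
for epoch = 1:num_epochs
    idx = randperm(m);                      % visit the examples in a random order
    for k = 1:m
        i = idx(k);
        x_i = X(i, :)';                     % one training example as a column vector
        err = x_i' * theta - y(i);          % prediction error on this single example
        theta = theta - alpha * err * x_i;  % update theta from this example only
    end
end
end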

Since minimising J is just an equation-solving problem, we can set all the partial derivatives to 0 and solve directly for the parameters.

Therefore the normal equation is another approach.
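Setting all partial derivatives of J to zero gives the closed-form solution

\theta = (X^T X)^{-1} X^T y,

which is what the normalEqn function computes below (with the pseudo-inverse in place of the plain inverse).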

Why might the least-squares cost function J be a reasonable choice?

There is a probabilistic interpretation.
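Briefly, following the CS229 notes: assume y^{(i)} = \theta^T x^{(i)} + \epsilon^{(i)} with Gaussian noise \epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2). Then maximising the likelihood of the training data with respect to \theta is equivalent to minimising the least-squares cost J(\theta), which is why least squares is a natural choice.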

Underfitting and overfitting depend on the features and model we choose!

Locally weighted linear regression:

Basic idea: give different training examples different weights.

Sometimes we face the problem that a new prediction should mostly be influenced by the nearby data, especially in time series. The closer, the more important!

How to set the weights? That remains to be considered; one standard choice is sketched below.
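A standard choice from the CS229 notes is to weight training example i by its distance to the query point x:

w^{(i)} = \exp\left( -\frac{(x^{(i)} - x)^2}{2\tau^2} \right),

where the bandwidth \tau controls how quickly the weight falls off with distance.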

OK.

Next, Linear Regression with multiple variables!

Approach: vectorise h(x) so that the multi-variable problem looks just like the one-variable problem.

(The figure referred to here is from Andrew Ng's Lecture 4 slides; it is not reproduced in this post.)
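With n features and the convention x_0 = 1, the hypothesis becomes a single inner product,

h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n,

so the one-variable machinery (cost function, gradient descent) carries over unchanged.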

Feature Scaling and Mean Normalization
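If I remember correctly, the multi-variable part of Assignment 1 includes a featureNormalize function; a minimal sketch consistent with mean normalisation (subtract each feature's mean, divide by its standard deviation; the exact return values here are my assumption) is:

function [X_norm, mu, sigma] = featureNormalize(X)
% Mean normalisation: centre each feature and scale it by its standard
% deviation so that all features end up on a comparable range.
mu = mean(X);        % 1 x n vector of feature means
sigma = std(X);      % 1 x n vector of feature standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
end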

Learning rate: if alpha is too small, convergence is slow; if alpha is too large, the cost function may not decrease on every iteration and may not converge.

OK.

The next topic is the implementation of Linear Regression in Matlab.

We use Assignment 1 of the ML course to explain.

1. How to compute the cost function?

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
%J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

predictions = X*theta;
sqrErrors = (predictions - y).^2;
J = 1/(2*m)*sum(sqrErrors);

% =========================================================================

end
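A quick sanity check (my own toy data, not part of the assignment):

% Three one-feature examples; the first column of ones is the intercept term.
X = [1 1; 1 2; 1 3];
y = [1; 2; 3];
theta = [0; 1];                  % with these parameters the fit is exact,
J = computeCost(X, y, theta)     % so the cost J should come out as 0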
2. How to compute gradient descent?

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    theta = theta - alpha/m*X'*(X*theta - y);

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
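To run it on the toy data above (or the assignment's data), something along these lines should work; alpha = 0.01 and 1500 iterations are roughly the values used in the assignment, but treat them as an assumption:

alpha = 0.01;
num_iters = 1500;
theta = zeros(2, 1);                                        % start from all-zero parameters
[theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters);
plot(1:num_iters, J_history);                               % J should decrease and flatten out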
3. How to solve the normal equation?

function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

%theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%

% ---------------------- Sample Solution ----------------------

theta = pinv(X'*X)*X'*y;

% -------------------------------------------------------------

% ============================================================

end
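Note that pinv (the pseudo-inverse) is used instead of inv, so the computation still behaves sensibly when X'*X is non-invertible (for example redundant features, or more features than examples). On the toy data from the computeCost example above it should recover the exact fit:

theta_closed = normalEqn(X, y)   % expect something close to [0; 1]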
That's it!

Remember that X is a matrix: each row is one training example, and its first column is all ones for the intercept term!
