Machine Learning for beginners

POST ON 2024-03-05 BY WOLVES

Machine Learning

1. First chapter - Introduction

1.1 Supervised learning

  • regression
  • classification

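1.2 Unsupervised learning

  • Clustering
  • Dimensionality reduction
    • The process of reducing the number of random variables under consideration, with the goal of obtaining a set of principal variables.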

1.3 Jupyter Notebook

2. Supervised Learning

2.1 Linear regression

$$ f(x) = wx+b $$

  • x is input feature / variable

An experiment:

  • Cost function
    • Mean Squared Error (MSE)

$$ \text{Residual} = \hat{y}^{(i)} - y^{(i)}, \quad \hat{y}^{(i)} = f_{w,b}(x^{(i)}) $$

$$ J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} (f_{w,b}(x^{(i)}) - y^{(i)})^2 $$

  • $m$ is the number of training examples

  • goal

$$ \underset{w,b}{\text{minimize}}\ J(w,b) $$
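A minimal sketch of the cost function above, in plain Python so it runs in a Jupyter Notebook without extra packages. The names `predict` and `compute_cost` and the toy data are made up for illustration.

```python
def predict(x, w, b):
    """Linear model f(x) = w*x + b for a single input feature."""
    return w * x + b


def compute_cost(xs, ys, w, b):
    """Squared-error cost J(w,b) = 1/(2m) * sum of squared residuals."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        residual = predict(x, w, b) - y  # prediction minus target
        total += residual ** 2
    return total / (2 * m)


# Hypothetical toy data lying exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

cost_perfect = compute_cost(xs, ys, w=2.0, b=1.0)  # parameters match the data
cost_off = compute_cost(xs, ys, w=1.0, b=0.0)      # parameters are wrong
print(cost_perfect, cost_off)
```

The cost is zero when the line passes through every point and grows as the predictions move away from the targets, which is why minimizing $J(w,b)$ gives a good fit.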

2.2 Gradient Descent

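  • From the current point, take one step in the direction of steepest descent, then pick the next direction and repeat, eventually reaching a local minimum (a greedy approach)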

$$ w = w - \alpha\cdot\frac{\partial J(w,b)}{\partial w} $$

$$ b = b - \alpha\cdot\frac{\partial J(w,b)}{\partial b} $$

  • $ \alpha $ is the learning rate
  • Simultaneous update
    • This means we compute both partial derivatives first, then update both values
  • The minus sign means each step moves along the negative gradient
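The update rule can be sketched as follows for the single-feature model $f(x) = wx + b$ with the squared-error cost. Differentiating that cost gives $\frac{\partial J}{\partial w} = \frac{1}{m}\sum_i (f(x^{(i)}) - y^{(i)})\,x^{(i)}$ and $\frac{\partial J}{\partial b} = \frac{1}{m}\sum_i (f(x^{(i)}) - y^{(i)})$. The function names, the toy data, and the choices `alpha=0.1` and `num_iters=5000` are assumptions for illustration only.

```python
def compute_gradients(xs, ys, w, b):
    """Partial derivatives of the squared-error cost with respect to w and b."""
    m = len(xs)
    dj_dw = 0.0
    dj_db = 0.0
    for x, y in zip(xs, ys):
        error = (w * x + b) - y
        dj_dw += error * x
        dj_db += error
    return dj_dw / m, dj_db / m


def gradient_descent(xs, ys, w, b, alpha, num_iters):
    """Repeat the update step num_iters times and return the final (w, b)."""
    for _ in range(num_iters):
        # Compute both gradients from the OLD w and b first ...
        dj_dw, dj_db = compute_gradients(xs, ys, w, b)
        # ... then update both values together (simultaneous update).
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b


# Hypothetical toy data lying exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

w_fit, b_fit = gradient_descent(xs, ys, w=0.0, b=0.0, alpha=0.1, num_iters=5000)
print(w_fit, b_fit)
```

Updating `w` first and then using the new `w` to compute the gradient for `b` would not be a simultaneous update, so the gradients are always computed before either parameter changes.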

Learning Rate

  • Small
    • Gradient descent may be slow
  • Large
    • Gradient descent may
      • Overshoot
      • Fail to converge
      • Diverge
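To see the effect of the learning rate, here is a small experiment on a stand-in cost $J(w) = w^2$, whose gradient is $2w$ and whose minimum is at $w = 0$. This toy cost and the three values of `alpha` are assumptions chosen to make the behavior easy to see; they are not taken from the notes above.

```python
def run_gradient_descent(alpha, w_start=1.0, num_steps=20):
    """Gradient descent on the toy cost J(w) = w**2 (gradient is 2*w)."""
    w = w_start
    for _ in range(num_steps):
        gradient = 2 * w
        w = w - alpha * gradient
    return w


w_small = run_gradient_descent(alpha=0.01)  # tiny steps: still far from 0
w_good = run_gradient_descent(alpha=0.4)    # reasonable steps: very close to 0
w_large = run_gradient_descent(alpha=1.1)   # overshoots each step and diverges

print(w_small, w_good, w_large)
```

With `alpha=1.1` every step jumps past the minimum to the other side and lands farther away than before, so the magnitude of `w` keeps growing instead of shrinking.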


  • Each step of gradient descent

2.3 Multiple features (variables)

$$ \vec{x}^{(2)} = \begin{pmatrix}x_1^{(2)} & x_2^{(2)} & x_3^{(2)} & x_4^{(2)}\end{pmatrix} $$

  • $x_j$ : the $j^{th}$ feature
  • $n$ : number of features
  • $\vec{x}^{(i)}$ : features of the $i^{th}$ training example
  • $x_j^{(i)}$ : value of feature $j$ in the $i^{th}$ training example
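The notation maps onto a table of data like this, with one row per training example and one column per feature. The numbers are made up, and the variable names are only for illustration. Note that the math counts from 1 while Python indexes from 0.

```python
# Hypothetical training set: 3 training examples (rows), 4 features (columns)
X = [
    [10, 20, 30, 40],
    [11, 21, 31, 41],
    [12, 22, 32, 42],
]

m = len(X)       # number of training examples
n = len(X[0])    # number of features

x_2 = X[1]       # \vec{x}^{(2)}: all features of the 2nd training example
x_2_3 = X[1][2]  # x_3^{(2)}: value of feature 3 in the 2nd training example

print(m, n, x_2, x_2_3)
```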

2.3.1 Model

  • Previously $$ f_{w,b}(x) = wx + b $$

  • Now

    multiple linear regression $$ f_{w,b}(x) = w_1x_1 + w_2x_2 + \cdots + w_nx_n + b $$


    $$ \vec{w} = \begin{bmatrix} w_1 & w_2 & w_3 & \cdots & w_n \end{bmatrix} $$

    $$ \vec{x} = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_n \end{bmatrix} $$

    $$ f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b $$

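    This is not multivariate regression, which models the relationship between multiple independent variables and multiple dependent variables (that is, matrix to matrix).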
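The dot-product form of the model can be sketched in plain Python as below. The helper names `dot` and `predict` and the example numbers are assumptions for illustration; in practice a library routine such as NumPy's `np.dot` would usually replace the hand-written `dot`.

```python
def dot(w, x):
    """Dot product w . x = w_1*x_1 + w_2*x_2 + ... + w_n*x_n."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same number of entries")
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


def predict(x, w, b):
    """Multiple linear regression model: f(x) = w . x + b."""
    return dot(w, x) + b


# Hypothetical parameters and one example with n = 3 features
w = [0.5, -1.0, 2.0]
b = 4.0
x = [2.0, 3.0, 1.0]

y_hat = predict(x, w, b)  # 0.5*2.0 + (-1.0)*3.0 + 2.0*1.0 + 4.0
print(y_hat)
```

The output is a single number per example, which is what separates multiple linear regression (many inputs, one output) from multivariate regression (many outputs).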