ML: Dimensionality Reduction - Principal Component Analysis
Source: Machine Learning, taught by Andrew Ng (Stanford University) on Coursera
Dimensionality Reduction - Principal Component Analysis (PCA)
notations:
$u_k \in \mathbb{R}^n$: the $k$-th principal component (direction of variation)
$z^{(i)} \in \mathbb{R}^k$: the projection of the $i$-th example $x^{(i)}$ onto the first $k$ principal components
$x_{approx}^{(i)} \in \mathbb{R}^n$: the approximation of $x^{(i)}$ recovered from its projection $z^{(i)}$
problem formulation:
For an $n$-dimensional input dataset, reduce it to $k$ dimensions. That is, find $k$ vectors ($u_1, u_2, \cdots, u_k$) onto which to project the data so as to minimize the average squared projection error:
$$ error = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2 $$
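A minimal Octave sketch of this error, assuming X and X_approx are m-by-n matrices holding $x^{(i)}$ and $x_{approx}^{(i)}$ as rows (the variable names are assumptions):
m = size(X, 1);                                % number of examples
proj_err = sum(sum((X - X_approx) .^ 2)) / m;  % average squared projection error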
algorithm process:
1. perform feature scaling and mean normalization on the original dataset $x^{(i)} \in \mathbb{R}^n$
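A minimal Octave sketch of this step, assuming X is the m-by-n data matrix with one example per row (the variable names mu, sigma, and X_norm are assumptions):
mu = mean(X);                  % 1-by-n vector of feature means
sigma = std(X);                % 1-by-n vector of per-feature standard deviations
X_norm = (X - mu) ./ sigma;    % mean-normalize, then scale each feature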
2. compute the covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$:
$$ \Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)})(x^{(i)})^{T} $$
Sigma = X' * X / m;   % X is m-by-n, one (normalized) example per row
3. compute the eigenvectors of the covariance matrix using:
[U, S, V] = svd(Sigma);   % columns of U are the eigenvectors of Sigma
$$ U = \begin{bmatrix}| & | & & | \\u_1 & u_2 & \cdots & u_n \\| & | & & | \\\end{bmatrix} $$
4. select the first $k$ columns of matrix $U \in \mathbb{R}^{n \times n}$ as the $k$ principal components:
U_reduce = U(:, 1:k);
5. project each $x^{(i)}$ onto the principal components to obtain a $k$-dimensional vector $z^{(i)}$:
$$ z^{(i)} = U_{reduce}^{T}x^{(i)} $$
Z = X * U_reduce;   % m-by-k, one projected example per row
6. reconstruction from compressed representation:
$$ x_{approx}^{(i)} = U_{reduce}z^{(i)} $$
X_approx = Z * U_reduce';   % m-by-n, row-wise reconstruction of X
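Putting the steps together, a minimal end-to-end Octave sketch, assuming X is the already mean-normalized m-by-n data matrix and k is chosen in advance:
[m, n] = size(X);
Sigma = X' * X / m;            % n-by-n covariance matrix
[U, S, V] = svd(Sigma);        % columns of U are the principal components
U_reduce = U(:, 1:k);          % keep the first k components (n-by-k)
Z = X * U_reduce;              % m-by-k compressed representation
X_approx = Z * U_reduce';      % m-by-n reconstruction of the data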
choosing the number of principal components:
The average squared projection error is:
$$ error = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2 $$
The total variation of the dataset is:
$$ variation = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)}\right\|^2 $$
Typically, choose $k$ to be the smallest value so that:
$$ \frac{error}{variation} = \frac{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2}{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)}\right\|^2} \leq 0.01 $$
i.e. 99% of variation is retained.
In practice, this ratio can be computed directly from the diagonal matrix $S$ returned by svd:
$$ \frac{error}{variation} = 1 - \frac{\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} $$
$$ S = \begin{bmatrix}S_{11} & & & \\ & S_{22} & & \\ & & \ddots & \\ & & & S_{nn} \\\end{bmatrix} $$
Hence, svd only needs to be run once; then pick the smallest $k$ such that:
$$ \frac{\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} \geq 0.99 $$
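A minimal Octave sketch of this selection rule, using the diagonal matrix S returned by svd (the variable names s, retained, and k are assumptions):
s = diag(S);                   % vector of diagonal entries S_11, ..., S_nn
retained = cumsum(s) / sum(s); % fraction of variation retained for k = 1, ..., n
k = find(retained >= 0.99, 1); % smallest k retaining at least 99%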
usages of PCA:
- data compression: reduce the memory/disk space needed and speed up learning algorithms
- data visualization: reduce the data to 2D or 3D so that it can be plotted (a minimal sketch follows this list)
- improper use: applying PCA to prevent overfitting; use regularization instead, since PCA discards some information without considering the labels
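For the visualization use case, a minimal Octave sketch that reduces an already mean-normalized data matrix X (one example per row) to 2D and plots it:
m = size(X, 1);
[U, S, V] = svd(X' * X / m);   % principal components of the data
Z = X * U(:, 1:2);             % project onto the top 2 components
plot(Z(:, 1), Z(:, 2), 'o');   % 2D scatter plot of the compressed data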
Originally posted at: https://www.cnblogs.com/ms-qwq/p/16484697.html