Kernel PCA


Assume that we are dealing with centred data: \sum^n_{i=1}\phi(x_i)=0
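In practice the mapped data will not be centred by itself; the usual way to satisfy this assumption is to centre the kernel matrix directly, without ever computing \phi. A minimal numpy sketch (the function name center_kernel is my own):

import numpy as np

def center_kernel(K):
    # Implicitly replaces phi(x_i) by phi(x_i) - (1/n) sum_j phi(x_j):
    # K' = K - 1_n K - K 1_n + 1_n K 1_n, where 1_n is the n x n matrix
    # whose entries are all 1/n.
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    return K - one @ K - K @ one + one @ K @ one

Every row of the centred matrix sums to zero, which is the kernel-matrix counterpart of \sum^n_{i=1}\phi(x_i)=0.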

The covariance matrix then takes the form C=\frac{1}{n}\sum^n_{i=1}\phi(x_i)\phi(x_i)^T

Then we have to find eigenvalues \lambda\geq0 and nonzero eigenvectors v in the Hilbert space satisfying \lambda v=Cv

All solutions v with \lambda\neq0 lie in the span of \phi(x_1),...,\phi(x_n), because

\lambda v=Cv=\frac{1}{n}\sum^n_{i=1}\phi(x_i)\phi(x_i)^Tv=\frac{1}{n}\sum^n_{i=1}(\phi(x_i)^Tv)\phi(x_i)   (\phi(x_i)^Tv is an inner product between two vectors, hence a scalar, so it can be moved to the front.)
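To make this concrete, here is a quick numpy check with an explicit (and arbitrarily chosen) feature map:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)                    # 20 scalar inputs
Phi = np.column_stack([x, x**2, x**3])     # explicit map phi(x) = (x, x^2, x^3)
Phi -= Phi.mean(axis=0)                    # enforce sum_i phi(x_i) = 0
n = len(x)

C = Phi.T @ Phi / n                        # C = (1/n) sum_i phi(x_i) phi(x_i)^T
lam, V = np.linalg.eigh(C)
v = V[:, -1]                               # eigenvector of the largest eigenvalue

# lambda * v equals C v, which is by construction a weighted sum of the phi(x_i)
assert np.allclose(lam[-1] * v, C @ v)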

We can draw two useful consequences from this equation. The first is:

If we take the inner product with a point \phi(x_j) on both sides, we get

\lambda\langle\phi(x_j),v\rangle=\langle\phi(x_j),Cv\rangle

The second consequence is that the eigenvector can be written as a linear combination of the mapped points in the Hilbert space:

v=\sum^n_{i=1}\alpha_i\phi(x_i)  (with \alpha_i=\frac{\phi(x_i)^Tv}{n\lambda})
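Both consequences are easy to verify numerically; a small sketch along the same lines (the quadratic feature map is again an arbitrary choice):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=15)
Phi = np.column_stack([x, x**2])           # rows are phi(x_i)^T
Phi -= Phi.mean(axis=0)                    # centred features
n = len(x)

C = Phi.T @ Phi / n
lam, V = np.linalg.eigh(C)
lam_top, v = lam[-1], V[:, -1]

alpha = (Phi @ v) / (n * lam_top)          # alpha_i = phi(x_i)^T v / (n lambda)
assert np.allclose(Phi.T @ alpha, v)       # v = sum_i alpha_i phi(x_i)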

Substituting consequence 2 into consequence 1:

We can get \lambda\langle\phi(x_j),\sum^n_{i=1}\alpha_i\phi(x_i)\rangle=\langle\phi(x_j),\frac{1}{n}\sum^n_{k=1}\phi(x_k)\phi(x_k)^T\sum^n_{i=1}\alpha_i\phi(x_i)\rangle    (43)

There are two properties of the inner product that we will use:

\langle x, \lambda x'\rangle=\lambda\langle x,x'\rangle (\lambda is a constant)

\langle x,\sum_i\alpha_ix'_i\rangle =\sum_i\alpha_i\langle x,x'_i\rangle (\alpha_i is a constant)
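Both properties are just the bilinearity of the inner product; a quick numerical sanity check:

import numpy as np

rng = np.random.default_rng(2)
x, x1, x2 = rng.normal(size=(3, 4))        # three random 4-dimensional vectors
c, a1, a2 = 2.0, -1.5, 0.7

# <x, c x'> = c <x, x'>
assert np.isclose(np.dot(x, c * x1), c * np.dot(x, x1))

# <x, a1 x1 + a2 x2> = a1 <x, x1> + a2 <x, x2>
assert np.isclose(np.dot(x, a1 * x1 + a2 * x2),
                  a1 * np.dot(x, x1) + a2 * np.dot(x, x2))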

Using these two properties, equation (43) can be rewritten, for all j=1,…,n, as:

\lambda\sum^n_{i=1}\alpha_i\langle\phi(x_j),\phi(x_i)\rangle=\frac{1}{n}\sum^n_{i=1}\alpha_i\langle\phi(x_j),\sum^n_{k=1}\phi(x_k)\langle\phi(x_k),\phi(x_i)\rangle\rangle;   (44)

\lambda\sum^n_{i=1}\alpha_i\langle\phi(x_j),\phi(x_i)\rangle=\frac{1}{n}\sum^n_{i=1}\alpha_i\sum^n_{k=1}\langle\phi(x_k),\phi(x_i)\rangle\langle\phi(x_j),\phi(x_k)\rangle;   (45)

n\lambda\sum^n_{i=1}\alpha_i\underbrace{\langle\phi(x_j),\phi(x_i)\rangle}=\sum^n_{i=1}\alpha_i\underbrace{\sum^n_{k=1}\langle\phi(x_j),\phi(x_k)\rangle\langle\phi(x_k),\phi(x_i)\rangle}.  (46)

The two underbraces highlight the kernel matrix: \langle\phi(x_j),\phi(x_k)\rangle=K(x_j,x_k)=K_{jk}, and

\sum^n_{k=1}K_{jk}K_{ki}=[K\cdot K]_{ji}=[K^2]_{ji}  (This is just the definition of matrix multiplication: the j-th row of K multiplied with the i-th column of K gives the (j,i) entry of the product.)
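The identity is easy to confirm in numpy (a linear kernel is used only to keep the example short):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 2))
K = X @ X.T                                # K_jk = <phi(x_j), phi(x_k)> for the linear kernel

j, i = 1, 4
lhs = sum(K[j, k] * K[k, i] for k in range(K.shape[0]))
assert np.isclose(lhs, (K @ K)[j, i])      # sum_k K_jk K_ki = [K^2]_ji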

 

Kernel PCA as an eigenvector problem

Equation (46) can be rewritten as:

n\lambda\textbf{K}\pmb{\alpha}=\textbf{K}^2\pmb{\alpha}   (47)

where \pmb{\alpha} denotes the column vector with entries \alpha_1,...,\alpha_n.

To find the solutions of Equation (47), we solve the simpler problem

n\lambda\pmb{\alpha}=\textbf{K}\pmb{\alpha}

which we obtain by multiplying (47) by \textbf{K}^{-1} from the left (assuming \textbf{K} is invertible; in general, components of \pmb{\alpha} in the null space of \textbf{K} contribute nothing to the expansion v=\sum_i\alpha_i\phi(x_i), so no solutions of interest are lost).
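A sketch of this step, assuming a symmetric positive semidefinite kernel matrix so that np.linalg.eigh applies:

import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(8, 3))
K = X @ X.T                                # Gram matrix of a linear kernel

mu, A = np.linalg.eigh(K)                  # solves K alpha = mu alpha, with mu = n lambda
alpha = A[:, -1]                           # eigenvector of the largest eigenvalue

# any solution of the simpler problem also solves (47): K^2 alpha = mu K alpha
assert np.allclose(K @ (K @ alpha), mu[-1] * (K @ alpha))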

 

Normalizing the coefficients

We require the eigenvectors \pmb{v}^k to have unit length, that is, \langle \pmb{v}^k,\pmb{v}^k\rangle=1 for all k=1,…,r.

That means that

1=\langle\pmb{v}^k,\pmb{v}^k\rangle

=\sum^n_{i,j=1}\alpha_i^k\alpha^k_j\langle\phi(x_i),\phi(x_j)\rangle

=\sum^n_{i,j=1}\alpha^k_i\alpha^k_jK_{ij}

=\langle\pmb{\alpha}^k,\pmb{K}\pmb{\alpha}^k\rangle=\lambda_k \langle\pmb{\alpha}^k,\pmb{\alpha}^k\rangle

where \lambda_k denotes the k-th eigenvalue of \pmb{K} (that is, \lambda_k=n\lambda).

As eigenvectors of \pmb{K}, the \pmb{\alpha}^k have unit norm.

Therefore we have to rescale them by \frac{1}{\sqrt{\lambda_k}} to enforce that \langle\pmb{\alpha}^k,\pmb{\alpha}^k\rangle=\frac{1}{\lambda_k}, so that \langle\pmb{v}^k,\pmb{v}^k\rangle=1.
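Putting the normalization together, a minimal sketch of the whole eigenvector computation (the helper name kpca_coefficients is my own, and the kernel matrix is assumed to be centred already):

import numpy as np

def kpca_coefficients(K, r):
    # eigendecomposition of the (assumed centred) kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K)
    lam = eigvals[::-1][:r]                # r largest eigenvalues lambda_k of K
    A = eigvecs[:, ::-1][:, :r]            # corresponding unit-norm alpha^k as columns
    # rescale each alpha^k by 1/sqrt(lambda_k) so that <alpha^k, alpha^k> = 1/lambda_k,
    # giving <v^k, v^k> = lambda_k <alpha^k, alpha^k> = 1
    return A / np.sqrt(lam)

rng = np.random.default_rng(5)
X = rng.normal(size=(10, 2))
X -= X.mean(axis=0)                        # centred data gives centred features for a linear kernel
A = kpca_coefficients(X @ X.T, r=2)
v1 = X.T @ A[:, 0]                         # v^1 = sum_i alpha_i^1 phi(x_i)
assert np.isclose(v1 @ v1, 1.0)            # unit length, as required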
