Monday, 7 November 2016

Multilinear Principal Component Analysis

In the previous blog post (PCA), a one-dimensional latent space was introduced. In practice, however, a latent space with more dimensions is usually preferred. To achieve this, a set of projections $\{w_1,w_2,\dots,w_d\}$ is needed. Assume that the latent representations are $\{y_1,y_2,\dots,y_n\},\ y_i\in R^d$ and the original data set is $\{x_1,x_2,\dots,x_n\},\ x_i\in R^F$; then:
\[
y_i=\begin{bmatrix}
           y_{i1}\\
           \vdots\\
          y_{id}
       \end{bmatrix}
     =\begin{bmatrix}
          w_1^Tx_i\\
          \vdots\\
         w_d^Tx_i
       \end{bmatrix}
     =W^Tx_i
\]
where
\[W=[w_1,w_2,\dots,w_d]\]
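As a quick numerical illustration of the projection (a minimal numpy sketch; the synthetic data and the orthonormal $W$ taken from a QR decomposition are my own assumptions here, not yet the optimal projection):

    import numpy as np

    rng = np.random.default_rng(0)
    N, F, d = 100, 5, 2                     # samples, original dimension, latent dimension
    X = rng.normal(size=(N, F))             # row i is the data point x_i

    # Any orthonormal basis works for projection; take one from a QR decomposition.
    W, _ = np.linalg.qr(rng.normal(size=(F, d)))

    Y = X @ W                               # row i is y_i = W^T x_i, shape (N, d)
    print(Y.shape)                          # (100, 2)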
Multilinear Principal Component Analysis can then be carried out by maximising the variance along each dimension of the latent space:
\[
\begin{align*}
W_o&=\underset{W}{\arg\max}\quad\frac{1}{N}\sum_{k=1}^d\sum_{i=1}^N(y_{ik}-u_k)^2\\
&=\underset{W}{\arg\max}\quad\frac{1}{N}\sum_{k=1}^d\sum_{i=1}^Nw_k^T(x_i-u)(x_i-u)^Tw_k\\
&=\underset{W}{\arg\max}\quad\sum_{k=1}^dw_k^TS_tw_k\\
&=\underset{W}{\arg\max}\quad tr[W^TS_tW]
\end{align*}
\]
where $u=\frac{1}{N}\sum_{i=1}^Nx_i$ is the sample mean, $u_k=w_k^Tu$ is the mean of the $k$-th latent dimension, and $S_t=\frac{1}{N}\sum_{i=1}^N(x_i-u)(x_i-u)^T$ is the total scatter matrix, which absorbs the factor $\frac{1}{N}$.
\[\text{s.t.}\quad W^TW=I\]
Formulate the Lagrangian, with a matrix $\Lambda$ of Lagrange multipliers for the orthonormality constraint:
\[
\begin{align*}
\mathcal{L}(W,\Lambda)&=tr[W^TS_tW] -tr[\Lambda(W^TW-I)]\\
\frac{\partial \mathcal{L}(W,\Lambda)}{\partial W}&=2S_tW - 2W\Lambda\\
\end{align*}
\]

Setting $\frac{\partial \mathcal{L}(W,\Lambda)}{\partial W}=0$ gives
\[\Rightarrow\quad S_tW=W\Lambda\]
which is an eigenvalue problem: the columns of $W$ must be eigenvectors of $S_t$, with the corresponding eigenvalues on the diagonal of $\Lambda$. A quick numerical check is sketched below.
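This relation is easy to verify numerically (again a minimal, self-contained numpy sketch on synthetic data):

    import numpy as np

    rng = np.random.default_rng(0)
    N, F = 100, 5
    X = rng.normal(size=(N, F))

    u = X.mean(axis=0)
    Xc = X - u                              # centred data
    S_t = Xc.T @ Xc / N                     # total scatter matrix

    # For a symmetric matrix, eigh returns eigenvalues in ascending order
    # with the matching eigenvectors as columns.
    eigvals, W = np.linalg.eigh(S_t)
    Lam = np.diag(eigvals)

    print(np.allclose(S_t @ W, W @ Lam))    # True: S_t W = W Lambda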

Substituting $S_tW=W\Lambda$ back into the original optimisation objective:
\[
\begin{align*}
W_o&=\underset{W}{\arg\max}\quad tr[W^TS_tW]\\
&=\underset{W}{\arg\max}\quad tr[W^TW\Lambda]\\
&=\underset{W}{\arg\max}\quad tr[\Lambda]
\end{align*}
\]
where the last step uses the constraint $W^TW=I$.
Therefore, maximising the objective amounts to maximising $tr[\Lambda]$, i.e. the sum of the selected eigenvalues of $S_t$, so the projection matrix $W=[w_1,w_2,\dots,w_d]$ is formed from the eigenvectors corresponding to the $d$ largest eigenvalues. A complete sketch of the procedure follows.
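Putting the derivation together, here is a minimal end-to-end sketch (synthetic data; the helper name pca_fit is mine, not from any library):

    import numpy as np

    def pca_fit(X, d):
        """Top-d eigenvectors of the total scatter matrix of X (rows are samples)."""
        u = X.mean(axis=0)
        Xc = X - u
        S_t = Xc.T @ Xc / len(X)
        eigvals, eigvecs = np.linalg.eigh(S_t)          # ascending order
        idx = np.argsort(eigvals)[::-1][:d]             # d largest eigenvalues
        return eigvecs[:, idx], eigvals[idx], u

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))  # correlated synthetic data

    W, lam, u = pca_fit(X, d=3)
    S_t = (X - u).T @ (X - u) / len(X)
    print(np.allclose(W.T @ W, np.eye(3)))                      # the constraint W^T W = I holds
    print(np.isclose(np.trace(W.T @ S_t @ W), lam.sum()))       # tr[W^T S_t W] = tr[Lambda]
    Y = (X - u) @ W                                             # latent representation of the data

In practice one would often use an SVD of the centred data instead for numerical stability, but the eigendecomposition above mirrors the derivation directly.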

Done.
