### Why does Principal Component Analysis (PCA) work?

Reading time: 15 minutes

Principal component analysis (PCA) is a technique to bring out strong patterns in a dataset by suppressing minor variations. It is used to clean data sets so that they are easier to explore and analyse. The PCA algorithm is based on a few mathematical ideas, namely:

- Variance and covariance
- Eigenvectors and eigenvalues

You need to understand the intuition behind these mathematical operations to understand why Principal Component Analysis works the way it does.

While PCA is a very technical method relying on in-depth linear algebra algorithms, it’s a relatively intuitive method when you think about it.

### Intuition behind Covariance matrix

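If you remember, we calculate the covariance matrix ZᵀZ for the data set, where Z is the centered (mean-subtracted) data matrix. Strictly speaking, ZᵀZ is divided by n − 1 (n being the number of observations) to give the sample covariance, but this scaling does not change the eigenvectors.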

The covariance matrix contains estimates of how every variable in Z relates to every other variable in Z. Understanding how one variable is associated with another is quite powerful.
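As a minimal sketch of this step, the snippet below builds a covariance matrix with NumPy. The toy data set `X` is made up for illustration, and the snippet assumes that Z is the column-centered data and that ZᵀZ is scaled by 1/(n − 1).

```python
import numpy as np

# Hypothetical toy data: 5 observations (rows) of 2 variables (columns).
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])

# Center each column so every variable has zero mean (assumption: Z is centered X).
Z = X - X.mean(axis=0)

# Z^T Z scaled by 1 / (n - 1) is the sample covariance matrix.
n = Z.shape[0]
cov = Z.T @ Z / (n - 1)

# Diagonal entries are the variances of each variable;
# off-diagonal entries are the covariances between pairs of variables.
print(cov)
```

The result should agree with `np.cov(X, rowvar=False)`, which performs the same centering and scaling internally.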

### Intuition behind Eigenvectors

We have calculated the eigenvalues and eigenvectors of the covariance matrix.

**Eigenvectors** represent **directions**. Think of plotting your data on a multidimensional scatterplot. Then one can think of an individual eigenvector as a particular “direction” in your scatterplot of data.

**Eigenvalues** represent **magnitude**, or importance. Each eigenvalue of the covariance matrix equals the variance of the data along its eigenvector, so bigger eigenvalues correspond to more important directions.
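To make the "direction" and "magnitude" reading concrete, here is a small sketch using `np.linalg.eigh` on the covariance matrix of a made-up data set (the data and variable names are assumptions for illustration). It checks that the variance of the data projected onto each eigenvector matches the corresponding eigenvalue.

```python
import numpy as np

# Hypothetical toy data: 5 observations of 2 variables.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
Z = X - X.mean(axis=0)
cov = Z.T @ Z / (Z.shape[0] - 1)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# Reorder so the most important direction (largest eigenvalue) comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]  # each column is one direction (unit length)

# Project the centered data onto each direction and measure the spread there.
projected = Z @ eigvecs
var_along = projected.var(axis=0, ddof=1)

print("eigenvalues:", eigvals)
print("variance along each eigenvector:", var_along)
```

For this toy data the two printed arrays should match, which is the sense in which an eigenvalue measures how much the data vary along its eigenvector.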

Finally, we make an assumption that more variability in a particular direction correlates with explaining the behavior of the dependent variable. Lots of variability usually indicates signal, whereas little variability usually indicates noise. Thus, more variability in a particular direction is, in theory, indicative of something important that we want to detect.

Thus, PCA is a method that brings together the following key ideas:

- A measure of how each variable is associated with every other variable. (Covariance matrix.)
- The directions in which our data are dispersed. (Eigenvectors.)
- The relative importance of these different directions. (Eigenvalues.)
- PCA combines our predictors and allows us to drop the eigenvectors that are relatively unimportant.
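The ideas listed above can be put together in a short sketch. The helper `pca_reduce` and the toy data are hypothetical names introduced for illustration; the sketch assumes centered (not standardized) data and keeps only the `k` directions with the largest eigenvalues.

```python
import numpy as np

def pca_reduce(X, k):
    """Illustrative PCA sketch: project X onto its top-k principal directions.

    Returns the projected data and the share of total variance per direction.
    """
    # Covariance matrix of the centered data.
    Z = X - X.mean(axis=0)
    cov = Z.T @ Z / (Z.shape[0] - 1)

    # Directions (eigenvectors) and their importance (eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Fraction of the total variance captured by each direction.
    explained = eigvals / eigvals.sum()

    # Drop the less important eigenvectors and project onto the rest.
    scores = Z @ eigvecs[:, :k]
    return scores, explained

# Hypothetical toy data: 5 observations of 2 variables.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])

scores, explained = pca_reduce(X, k=1)
print("explained variance ratio:", explained)
print("reduced data shape:", scores.shape)
```

In this toy example the first direction carries about 90% of the total variance, so dropping the second eigenvector reduces the data from two columns to one while discarding relatively little variability.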