Machine & Deep Learning Compendium


Dimensionality Reduction Methods


- **StatQuest** - the gist of it is that we assume a t-distribution on distances and remove those that are farther, normalized for density. The t-distribution is used so that clusters are not clamped in the middle.


- **Machine learning mastery:**
  - **EigenDecomposition** - what is an eigenvector? Simply put, it's a vector that satisfies A*v = lambda*v; how to use eig(), and how to confirm an eigenvector/eigenvalue and reconstruct the original A matrix.
  - **What is missing is how the EigenDecomposition is calculated.**
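The eig() workflow described above - verify A*v = lambda*v, then reconstruct A - can be sketched in NumPy (a minimal sketch; the 2x2 matrix is just an illustrative example):

```python
import numpy as np

# A small symmetric matrix as a toy example
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig() returns the eigenvalues and a matrix whose columns are eigenvectors
eigenvalues, V = np.linalg.eig(A)

# Confirm the defining property A @ v = lambda * v for the first pair
v, lam = V[:, 0], eigenvalues[0]
assert np.allclose(A @ v, lam * v)

# Reconstruct the original matrix: A = V @ diag(lambda) @ V^-1
A_reconstructed = V @ np.diag(eigenvalues) @ np.linalg.inv(V)
assert np.allclose(A, A_reconstructed)
```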

- **Randomized SVD**
- **Incremental SVD**

- **How to use PCA in cross-validation and for train/test split** **(bottom line: fit it on the train set only.)**
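The "fit on the train set only" point can be sketched with a scikit-learn Pipeline, which guarantees PCA is refit inside each CV fold on that fold's training part only (the dataset here is just illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# PCA is fit on the train split only; the test split is merely transformed
pipe = make_pipeline(PCA(n_components=2), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)
print("test accuracy:", pipe.score(X_test, y_test))

# In cross-validation, the pipeline refits PCA on each fold's training part
scores = cross_val_score(pipe, X, y, cv=5)
print("cv mean:", scores.mean())
```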

- **Make the features less correlated with one another.**
- **Give all of the features the same variance.**

- **Project the dataset onto the eigenvectors. This rotates the dataset so that there is no correlation between the components.**
- **Normalize the dataset to have a variance of 1 for all components. This is done by simply dividing each component by the square root of its eigenvalue.**
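The two steps above - rotate onto the eigenvectors, then divide each component by the square root of its eigenvalue - are PCA whitening; a minimal NumPy sketch on illustrative correlated data:

```python
import numpy as np

rng = np.random.RandomState(0)
# Mix independent noise through a matrix to get correlated features
X = rng.randn(500, 3) @ np.array([[2.0, 0.5, 0.0],
                                  [0.0, 1.0, 0.3],
                                  [0.0, 0.0, 0.5]])
X = X - X.mean(axis=0)

# Eigendecomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))

# Step 1: project onto the eigenvectors (decorrelates the components)
X_rot = X @ eigvecs

# Step 2: divide each component by sqrt(eigenvalue) (unit variance)
X_white = X_rot / np.sqrt(eigvals)

# The whitened covariance is (numerically) the identity matrix
print(np.round(np.cov(X_white, rowvar=False), 2))
```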

- **First they say that an Autoencoder is PCA, based on their equation, i.e., minimizing the reconstruction-error formula.**
- **Then they say that PCA can't separate certain non-linear situations (a circle within a circle); therefore they introduce kernel-based PCA (using the kernel trick, like SVM), which maps the space to another, linearly separable space and performs PCA on it.**

- **Finally, they show results of how KPCA works well on noisy images, compared to PCA.**
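The circle-within-a-circle case described above can be reproduced with sklearn's KernelPCA (an illustrative sketch; the gamma value is an assumption that works for this toy data):

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric rings: not linearly separable in the original space
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Plain PCA is only a rotation: the rings stay concentric and inseparable
X_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA with an RBF kernel maps the rings to a linearly separable space
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

print(X_pca.shape, X_kpca.shape)  # (400, 2) (400, 2)
```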

**PCA can be described as an "unsupervised" algorithm, since it "ignores" class labels and its goal is to find the directions (the so-called principal components) that maximize the variance in a dataset.** **In contrast to PCA, LDA is "supervised" and computes the directions ("linear discriminants") that will represent the axes that maximize the separation between multiple classes.**

**PCA tends to outperform LDA if the number of samples per class is relatively small (PCA vs. LDA, A.M. Martinez et al., 2001). In practice, it is also not uncommon to use LDA and PCA in combination:**
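The combination usually means PCA first for unsupervised dimensionality reduction, then LDA for supervised class separation; a hedged sklearn sketch (the component counts are illustrative choices):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

# PCA first (unsupervised, max-variance), then LDA (supervised separation)
combo = make_pipeline(PCA(n_components=30),
                      LinearDiscriminantAnalysis(n_components=2))
X_2d = combo.fit_transform(X, y)
print(X_2d.shape)  # (1797, 2)
```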


- **pyDML package** - has KDA. This package provides the classic algorithms of supervised distance metric learning, together with some of the newest proposals.

**LSA** is quite simple: you just use SVD to perform dimensionality reduction on the tf-idf vectors - that's really all there is to it! And **LSA CLUSTERING**.
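That recipe - SVD on the tf-idf matrix - in sklearn terms (a minimal sketch; the toy corpus is illustrative):

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "a dog chased the cat",
    "stocks fell on wall street",
    "investors sold stocks and bonds",
]

# LSA = tf-idf vectors followed by truncated SVD
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_topics = lsa.fit_transform(tfidf)

print(doc_topics.shape)  # (4, 2): each document as a 2-dim "topic" vector
```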

- **reduction of the dimensionality**
- **noise reduction**
- **incorporating relations between terms into the representation.**

**SVD and PCA and "total least-squares" (and several other names) are the same thing. It computes the orthogonal transform that decorrelates the variables and keeps the ones with the largest variance. There are two numerical approaches: one by SVD of the (centered) data matrix, and one by eigendecomposition of this matrix "squared" (covariance).**
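The two numerical routes can be checked against each other in NumPy (illustrative random data; components match only up to sign):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(100, 4)
Xc = X - X.mean(axis=0)  # center the data

# Route 1: SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Route 2: eigendecomposition of the "squared" matrix (covariance)
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / (len(X) - 1))
order = np.argsort(eigvals)[::-1]          # sort descending, like SVD
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Same variances: s^2 / (n - 1) equals the covariance eigenvalues
assert np.allclose(s**2 / (len(X) - 1), eigvals)

# Same principal directions, up to sign
assert np.allclose(np.abs(Vt), np.abs(eigvecs.T))
```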

- **While PCA is global - it finds global variables (with images we get eigenfaces, good for reconstruction) that maximize variance in orthogonal directions - it is not influenced by the TRANSPOSE of the data matrix.**
- **On the other hand, ICA is local and finds local variables (with images we get eyes, ears, mouth - basically edges! - etc.). ICA will give different results on TRANSPOSED matrices, unlike PCA; it's also "directional" - consider the "cocktail party" problem. On documents, ICA gives topics.**
- **It helps, similarly to PCA, to analyze our data.**
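The "cocktail party" intuition - unmixing independent sources - can be sketched with sklearn's FastICA on synthetic signals (an illustrative sketch, not the canonical solution):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)

# Two independent source signals (the "speakers")
s1 = np.sin(2 * t)
s2 = np.sign(np.sin(3 * t))
S = np.c_[s1, s2]

# Mix them through an unknown mixing matrix (the "microphones")
A = np.array([[1.0, 0.5], [0.5, 1.0]])
X = S @ A.T

# ICA recovers the sources, up to order, sign, and scale
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)

print(S_est.shape)  # (2000, 2)
```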

- **The best tutorial that explains manifold learning (high- to low-dimensional projection/mapping/visualization) (PCA, Sammon, Isomap, t-SNE)**

- **Contrary to what it says on sklearn's website, t-SNE is not suited ONLY for visualization; you can also use it for data reduction.**
- **"t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets."**
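A minimal t-SNE run in sklearn (illustrative; perplexity and init are the usual knobs to tune, and the 500-sample subset just keeps it fast):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X = X[:500]  # subset for speed; t-SNE scales poorly with sample count

# Embed the 64-dimensional digits into 2D for visualization
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_2d = tsne.fit_transform(X)

print(X_2d.shape)  # (500, 2)
```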

