Difference between tSNE and PCA analysis
6.8 years ago
Qingyang Xiao ▴ 160

Hello! As clustering methods, what's the main difference between tSNE and PCA analysis?

next-gen rna-seq • 25k views

Other than the math?


Yes, other than math, mainly for biological applications


Both are used for dimensionality reduction; which one applies depends on your variables and on the inference you are interested in. PCA has long been a favorite tool for RNA-Seq, ChIP-Seq and WES data, but with the arrival of scRNA-Seq and large-scale SNP data for population genetic inference, t-SNE is coming in handy as well. The links I have already given in my answer below should suffice, and I will post one more here on human genetic data. Your question is too broad, so you probably need to do some background study. Beyond that, the choice of dimensionality reduction method depends on the data you will be using.


Nonsense - the question isn't too broad. If someone just asks for the "main difference" you should be able to explain it in a sentence or two instead of bombarding them with links.


I guess you have to check what the OP wrote in the comments: they asked about biological applications.


Many thanks for these fascinating answers!

6.8 years ago

The main difference between t-SNE (or other manifold learning methods) and PCA is that t-SNE tries to preserve the relationships between neighboring points in high-dimensional data, whereas PCA is a linear projection onto the directions of greatest variance.

A classic example is the "swiss roll". To put the difference in layman's terms: t-SNE attempts to understand the underlying structure of the swiss roll by prioritizing neighboring points. PCA doesn't get what's going on: it doesn't see that the points actually lie on a flat sheet that's been rolled up.

Original data:

[image]

This PCA sucks (it thinks yellow is close to blue when in fact they are far away):

http://yinsenm.github.io/figure/STAT545/PCASwiss.png

In contrast, see how t-SNE seems to understand what's going on with this 'S'?

[image]
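To make the "yellow is close to blue" point concrete, here is a minimal numpy sketch (my own illustration, not the code behind the figures above). It builds a synthetic swiss roll, picks two points exactly one full turn apart, and compares the distance along the rolled-up sheet with the straight-line distance that PCA works with. The helper names `roll_point` and `distance_along_roll`, the sample size, and the seed are all assumptions made for the example.

```python
import numpy as np

def roll_point(t, height):
    """Map a position t along the sheet (and a height) to 3-D swiss roll coordinates."""
    return np.array([t * np.cos(t), height, t * np.sin(t)])

def distance_along_roll(t_start, t_end, steps=10_000):
    """Approximate the path length along the spiral between two positions."""
    ts = np.linspace(t_start, t_end, steps)
    path = np.column_stack([ts * np.cos(ts), ts * np.sin(ts)])
    return np.linalg.norm(np.diff(path, axis=0), axis=1).sum()

# Synthetic swiss roll: 1000 points, 1.5 turns of the spiral (illustrative values).
rng = np.random.default_rng(0)
t = 1.5 * np.pi * (1 + 2 * rng.random(1000))
h = 21 * rng.random(1000)
roll = np.column_stack([t * np.cos(t), h, t * np.sin(t)])

# PCA via SVD of the centered data; the rows of vt are orthonormal principal axes.
mean = roll.mean(axis=0)
_, _, vt = np.linalg.svd(roll - mean, full_matrices=False)
components = vt[:2]  # keep the top 2 principal axes

# Two points at the same height, exactly one full turn apart on the sheet.
a = roll_point(2 * np.pi, 10.0)
b = roll_point(4 * np.pi, 10.0)

along = distance_along_roll(2 * np.pi, 4 * np.pi)      # distance a neighbor-based method cares about
straight = np.linalg.norm(a - b)                       # straight-line distance in 3-D
projected = np.linalg.norm((a - b) @ components.T)     # distance after the PCA projection

print(f"along the sheet: {along:.1f}")
print(f"straight line:   {straight:.1f}")
print(f"after PCA (2-D): {projected:.1f}")
```

Because PCA is just a rotation followed by dropping axes, the projected distance can never be larger than the straight-line one, so two points that are far apart along the sheet but sit on adjacent layers of the roll stay close together in the PCA plot. Neighbor-based methods such as t-SNE instead try to keep each point next to its neighbors on the sheet.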


Jeremy Leipzig, are you able to re-upload the t-SNE picture?

It looks like the link has changed, so the image is missing.


Fixed this, thanks.

6.8 years ago
ivivek_ngs ★ 5.2k

I can suggest some links that will give you a flavor of both methods, which are used for dimensionality reduction.

  1. Link1
  2. Link2
  3. For scRNA-Seq, check here
  4. For bulk RNA-Seq, check here
  5. If you are a fan of Kaggle, this link is also a fun way to understand how the methods are used.

Thanks - I think in your point 5 you forgot to actually put the hyperlink.


Updated. Thanks for pointing it out.

6.8 years ago

Just a couple of comments... Neither tSNE nor PCA is a clustering method, even if in practice you can use them to see if/how your data form clusters. In practice tSNE often works downstream of PCA: many implementations and pipelines first compute the first n principal components and then map these n dimensions to a 2D space, although this is a preprocessing step rather than part of the tSNE algorithm itself. The original paper on tSNE is relatively accessible and, if I remember correctly, it has some discussion on PCA vs tSNE. Also, this post on tSNE is quite good, although not really about tSNE vs PCA.
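For what it's worth, the PCA-then-tSNE workflow can be sketched with scikit-learn roughly as follows. This is only an illustration under assumptions: the random matrix stands in for a samples x genes expression table, the function name `pca_then_tsne` is made up, and the 50 components and perplexity of 30 are common starting values rather than recommendations for any particular dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def pca_then_tsne(expression, n_pcs=50, perplexity=30.0, seed=0):
    """Reduce to n_pcs principal components, then embed those in 2-D with t-SNE."""
    # PCA cannot return more components than there are samples or features.
    n_pcs = min(n_pcs, *expression.shape)
    pcs = PCA(n_components=n_pcs, random_state=seed).fit_transform(expression)
    # t-SNE is run on the principal components, not on the raw matrix.
    embedding = TSNE(
        n_components=2, perplexity=perplexity, random_state=seed
    ).fit_transform(pcs)
    return pcs, embedding

# Hypothetical toy data: 200 samples x 500 "genes", two groups shifted apart.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(100, 500))
group_b = rng.normal(1.0, 1.0, size=(100, 500))
expression = np.vstack([group_a, group_b])

pcs, embedding = pca_then_tsne(expression)
print(pcs.shape, embedding.shape)
```

Note that neither step assigns cluster labels: both return coordinates, and any grouping you see in the 2D plot still has to be called by an actual clustering method (or by eye).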


Nice one, that is the reason I never used the term clustering ;)
