Hi,
Is this code right for using PCA to show how the samples correlate?
In gene_tpm_matrix, the row names are genes and the column names are samples (WT and MT).
prcomp(log2(gene_tpm_matrix+1), scale=T)
I'm not sure about my code (it log-transforms the TPM values and uses scale; I have seen someone use score instead), so I ran a test. I multiplied one sample, call it sample_MT_1, by 10000 to get sample_new, added sample_new to gene_tpm_matrix as one more line, and ran the PCA again. The sample_new point ended up far away from the sample_MT_1 point. I expected the two points to overlap completely, since the samples are linearly correlated and PCA uses linear arithmetic.
What worries me most is whether the PCA code is right. Is anybody familiar with this?
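A possible explanation for the test result, sketched on made-up data (the toy matrix `tpm` and its values are hypothetical, not real TPM data): after the log transform, multiplying a sample by 10000 becomes a shift of roughly log2(10000) for every gene. The two samples stay highly correlated, but PCA places samples by their centered values, not by their correlation, so a shifted sample would not be expected to land on the same point.

```r
set.seed(1)

# Hypothetical toy data: 200 genes x 6 samples (3 WT, 3 MT), not real TPM values
tpm <- matrix(rexp(200 * 6, rate = 0.01), nrow = 200,
              dimnames = list(paste0("gene", 1:200),
                              c(paste0("sample_WT_", 1:3),
                                paste0("sample_MT_", 1:3))))

# Repeat the test from the question: sample_new = sample_MT_1 * 10000
tpm <- cbind(tpm, sample_new = tpm[, "sample_MT_1"] * 10000)
log_tpm <- log2(tpm + 1)

# Correlation stays very high (just below 1 because of the +1 pseudocount)
r <- cor(log_tpm[, "sample_MT_1"], log_tpm[, "sample_new"])

# ...but every gene in sample_new is shifted up by close to log2(10000)
shift <- mean(log_tpm[, "sample_new"] - log_tpm[, "sample_MT_1"])

# PCA with samples as rows; distance between the two samples on PC1/PC2
pca <- prcomp(t(log_tpm), scale. = TRUE)
d <- sqrt(sum((pca$x["sample_new", 1:2] - pca$x["sample_MT_1", 1:2])^2))

print(c(correlation = r, mean_shift = shift, pc_distance = d))
```

If the goal is to compare samples by correlation alone, looking at `cor(log_tpm)` directly may be a closer fit than reading distances off a PCA plot.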
Thanks!
Exactly, I forgot to write this step here, so it is
prcomp(t(log2(gene_tpm_matrix+1)), scale=T)
now. Thanks!
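For reference, a minimal sketch of the transposed call on a stand-in matrix (the toy `gene_tpm_matrix` below is hypothetical and only mimics the genes x samples layout). It also drops genes with zero variance first, which is an added assumption rather than part of the original code: `prcomp` stops with an error when `scale. = TRUE` meets a constant column.

```r
set.seed(2)

# Hypothetical stand-in for gene_tpm_matrix: 100 genes x 4 samples
gene_tpm_matrix <- matrix(rexp(100 * 4, rate = 0.05), nrow = 100,
                          dimnames = list(paste0("gene", 1:100),
                                          c("WT_1", "WT_2", "MT_1", "MT_2")))
gene_tpm_matrix[1, ] <- 0  # a gene with no expression in any sample

log_tpm <- log2(gene_tpm_matrix + 1)

# Keep only genes that vary across samples; constant genes cannot be scaled
keep <- apply(log_tpm, 1, var) > 0

# Transpose so that samples are rows (observations) and genes are columns
pca <- prcomp(t(log_tpm[keep, ]), scale. = TRUE)

# One row of PC scores per sample
print(dim(pca$x))
```

Here `scale.` is the full argument name; `scale=T` works through partial matching, but `T` can be reassigned, so `TRUE` is the safer spelling.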