Question

How to relate each covariate to the PC loadings in PCA

0

6.1 years ago

Grace_G ▴ 20

Hello,

I have a read count matrix produced by htseq-count from RNA-Seq data. The row names are gene IDs and the column names are sample IDs (format: organ_gender_age). I then use PCA to find correction of samples, but how to each covariate(organ, gender, age) with PC loading???

Is there a recommended way to do this? Thank you in advance!

RNA-seq PCA • 2.7k views

ADD COMMENT • link 6.1 years ago by Grace_G ▴ 20

1

PCA tutorial by @Kevin: PCA plot from read count matrix from RNA-Seq

ADD REPLY • link 6.1 years ago by GenoMax 149k

0

Thanks, but it does not have a part on relating covariates to PC loadings.

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

1

Do you have a phenotype data file? This file should include the names of your samples, their conditions, and any other information such as batch. You can use that to perform a PCA in R. Make sure your counts are normalised!
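As a rough illustration of that workflow (normalise, then PCA), here is a minimal sketch in Python/NumPy rather than R. It assumes log2 counts-per-million as the normalisation, which is only one simple choice and not necessarily the one meant above, and the toy counts are random. `log_cpm` and `pca_scores` are hypothetical helper names.

```python
import numpy as np

def log_cpm(counts, prior=1.0):
    """counts: genes x samples array of raw read counts (assumed layout)."""
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0, keepdims=True)   # total reads per sample
    return np.log2(counts / lib_size * 1e6 + prior)

def pca_scores(expr, n_pcs=2):
    """expr: genes x samples. Returns (sample scores, explained variance ratio)."""
    x = expr.T                                     # samples x genes
    x = x - x.mean(axis=0, keepdims=True)          # centre each gene
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_pcs] * s[:n_pcs]              # sample coordinates on each PC
    var_ratio = s ** 2 / np.sum(s ** 2)
    return scores, var_ratio[:n_pcs]

# Toy data only: 200 genes x 6 samples of random counts.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=50, size=(200, 6))
scores, var_ratio = pca_scores(log_cpm(counts), n_pcs=2)
print(scores.shape)
```

The rows of `scores` are in the same order as the columns of the count matrix, so they can be lined up with a phenotype table afterwards.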

ADD REPLY • link 6.1 years ago by unawaz ▴ 60

0

Thanks for your reply! But a phenotype data file is used for DESeq2, isn't it? I am not using DESeq2 for the PCA here. Since the phenotype data only describe the samples, I combined them directly into the sample ID, so each sample ID carries all of them.
The read count matrix looks like:

        liver_female_2 heart_male_6 lung_female_3 liver_female_1...
gene1
gene2
gene3

After t(matrix) and normalisation I can run PCA with a function, but I still wonder whether I can get the PC loading for each covariate?
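Sample IDs in the organ_gender_age form can be split back into a phenotype table without a separate file. A minimal Python sketch, assuming every ID has exactly three underscore-separated fields and that the last field is an integer age; `parse_sample_id` is a hypothetical helper name:

```python
def parse_sample_id(sample_id):
    """Split 'organ_gender_age' into one phenotype record (assumed ID format)."""
    organ, gender, age = sample_id.split("_")
    return {"sample": sample_id, "organ": organ, "gender": gender, "age": int(age)}

sample_ids = ["liver_female_2", "heart_male_6", "lung_female_3", "liver_female_1"]
pheno = [parse_sample_id(s) for s in sample_ids]
for row in pheno:
    print(row)
```

Each record keeps the original sample ID, so the table can be matched to the columns of the count matrix.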

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

You're looking for "factor analysis", which is related to PCA but not exactly the same.

ADD REPLY • link 6.1 years ago by Devon Ryan 105k

0

Thanks Ryan, glad to hear from you! I'm not sure about that, and I'm also not sure what the input data would be. I just want to see which PC (e.g. PC1, PC2 or PC10) loads most on age (and on the other covariates). There seems to be little PCA material about this, so I'm looking for a package that does it, since I'm also not sure about the format of the input data.

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

What are you actually trying to do? This sentence is not clear:

then use PCA to find correction of samples, but how to each covariate(organ, gender, age) with PC loading???

ADD REPLY • link 6.1 years ago by Kevin Blighe 89k

0

Yes, it is like when we hope that wild-type samples group together and mutant samples group together on the PCA plot. Here I hope that samples from the same organ group together on the PCA plot; this is what I mean by "correction of samples". After drawing this plot, we next want to know how each covariate (organ, gender, age) relates to the PC loadings, and I don't know how to do this step.

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

Can you show the plot that you have, currently?

ADD REPLY • link 6.1 years ago by Kevin Blighe 89k

0

Sorry, I'm afraid I can't; actually it's a general question.

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

So you want something like the following but with the length of each arrow indicated?

[Image not preserved: an example plot with an arrow for each variable]

ADD REPLY • link 6.1 years ago by Devon Ryan 105k

0

Thanks! I guess its colnames (sample IDs) are honey, winey, body, ... and mine are liver_female_2, heart_male_6, lung_female_3, liver_female_1, ..., so I would want the organ loadings (liver, heart, lung), the gender loadings (female, male) and the age loadings (2, 6, 3, 1). My requirement is to correlate every covariate (e.g. age, gender, organ) with the top 6 PC loadings. the summary
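One possible reading of this requirement, sketched in Python/NumPy: for each of the top 6 PCs, regress the per-sample PC scores on one covariate and report R² (categorical covariates are dummy-coded, numeric ones used as-is). Strictly, PCA loadings are per-gene weights, so this measures the association between covariates and PC scores, which is an assumption about what is wanted. The data are simulated with a strong organ effect, and `design_matrix`, `r_squared` and `covariate_pc_table` are hypothetical helper names.

```python
import numpy as np

def design_matrix(values):
    """Intercept plus the covariate: numeric as one column, categorical as dummies."""
    values = list(values)
    if all(isinstance(v, (int, float)) for v in values):
        cols = [np.asarray(values, dtype=float)]
    else:
        levels = sorted(set(values))
        cols = [np.array([v == lv for v in values], dtype=float) for lv in levels[1:]]
    return np.column_stack([np.ones(len(values))] + cols)

def r_squared(y, values):
    """Fraction of variance in PC scores y explained by one covariate."""
    X = design_matrix(values)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def covariate_pc_table(scores, covariates):
    """R^2 of every covariate against every PC (columns of scores)."""
    return {name: [r_squared(scores[:, k], vals) for k in range(scores.shape[1])]
            for name, vals in covariates.items()}

# Simulated data: 12 samples x 100 genes with a strong organ effect.
rng = np.random.default_rng(1)
organ = ["liver", "heart", "lung"] * 4
gender = ["female", "male"] * 6
age = [1, 2, 3, 4, 5, 6] * 2
offsets = {o: rng.normal(0, 3, size=100) for o in ("liver", "heart", "lung")}
expr = np.array([offsets[o] + rng.normal(0, 0.3, size=100) for o in organ])

centered = expr - expr.mean(axis=0)            # centre each gene
u, s, _ = np.linalg.svd(centered, full_matrices=False)
scores = u[:, :6] * s[:6]                      # top 6 PC scores per sample

table = covariate_pc_table(scores, {"organ": organ, "gender": gender, "age": age})
for name, r2 in table.items():
    print(name, np.round(r2, 2))
```

With real data, `scores` would come from the PCA of the normalised counts and the covariates from the parsed sample IDs. R² alone says nothing about significance, so it is best read as a descriptive summary.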

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

Have a look at the FactoMineR package.

ADD REPLY • link 6.1 years ago by Devon Ryan 105k

0

That looks very promising; many thanks for your time, Ryan!

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20

0

Hi Ryan! The following may be related; in any case it is a really good document, so I am sharing it here.

Principal Variance Component Analysis (PVCA) to explore how technical and biological factors correlate with the major components of variance in the data set

3.6 F. Principal Variance Component Analysis of the raw data with the surrogate variables included as covariates link

ADD REPLY • link 6.1 years ago by Grace_G ▴ 20