I've been exploring PCA on data from a microarray experiment. The data set covers two genotypes of transgenic plants induced at 6 different time points. I have the following exploratory questions I would like to answer with PCA.
1) Where does most of the variation lie?
2) Which time points are similar, and which differ in their effect on gene expression?
3) Do the two genotypes differ under the same treatment conditions?
I have the above questions figured out. What I am not able to analyse using PCA is the following:
4) Which gene expression values contribute the most to the observed variation?
5) How do I visualise and represent only these genes on a PCA graph?
6) How do I block out the unimportant genes which don't contribute to the observed variation?
The packages I have explored so far are ggfortify, ggplot2 and ggbiplot. I have not been able to find any tutorials that show how to answer questions 4, 5 and 6 using these three packages.
First of all:
7) Is it possible to answer questions such as 4, 5 and 6 with PCA?
8) If yes, can someone point me to a tutorial that shows how this is done?
These would provide a good start:
PCA in a RNA seq analysis
PCA plot from read count matrix from RNA-Seq
@Kevin is active on Biostars so should answer any remaining questions.
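In the meantime, here is a rough sketch of one common way to approach questions 4 to 6: rank genes by the absolute value of their loadings on a principal component, then rerun the PCA on only the top-ranked genes. It is written in Python with NumPy purely for illustration, and the data are simulated (12 samples, 200 genes, with the first 5 genes made to respond to time), so the matrix `expr`, the gene names and the cutoff `top_n` are all assumptions rather than anything from your experiment. In R, `prcomp()` returns the same quantities as `x` (scores), `rotation` (loadings) and `sdev`; note that it expects samples in rows, so a genes-by-samples matrix needs transposing first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the real data: 12 samples (2 genotypes x 6 time
# points) in rows, 200 genes in columns. All values here are made up.
n_samples, n_genes = 12, 200
expr = rng.normal(size=(n_samples, n_genes))
time = np.tile(np.arange(6), 2)          # time index 0..5 for each genotype
expr[:, :5] += 3.0 * time[:, None]       # first 5 genes respond to time
gene_names = np.array([f"gene_{i}" for i in range(n_genes)])


def pca(x):
    """PCA via SVD of the column-centred matrix (samples in rows)."""
    centered = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                       # sample coordinates on each PC
    loadings = vt.T                      # one row per gene, one column per PC
    explained = s**2 / np.sum(s**2)      # fraction of variance per PC
    return scores, loadings, explained


scores, loadings, explained = pca(expr)

# Question 4: genes with the largest absolute loading on PC1 contribute
# most to the variation that PC1 captures. top_n is an arbitrary cutoff.
top_n = 5
top_idx = np.argsort(-np.abs(loadings[:, 0]))[:top_n]
top_genes = gene_names[top_idx]

# Questions 5 and 6: drop every other gene and rerun the PCA on the subset.
scores_top, loadings_top, explained_top = pca(expr[:, top_idx])

print("Top genes on PC1:", ", ".join(top_genes))
print("PC1 variance explained, all genes:", round(float(explained[0]), 3))
print("PC1 variance explained, top genes:", round(float(explained_top[0]), 3))
```

The first two columns of `scores_top` can then be plotted as a normal PCA scatter, and the rows of `loadings_top` drawn as arrows if you want a biplot restricted to those genes. Ranking on PC1 alone is a simplification; if PC2 or PC3 separate your genotypes, the same ranking can be repeated on those columns of `loadings`.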