principal component analysis on pool-seq SNP data
3.0 years ago

I would like to perform principal component analysis (PCA) on a pool-seq SNP dataset. I've been looking into methods for doing this, but have had trouble finding examples that apply to pooled data as opposed to individual genotypes. For example, I'm not sure whether PLINK can run PCA on pooled datasets. Is anyone familiar with whether PLINK can be used for this and, if not, with a toolkit or approach that would be well suited to PCA on pooled data?

Thanks in advance!

sequencing analysis component pooled principal • 1.4k views

Thanks, this tutorial is really in-depth and may be useful! Do you know if PLINK can be used for pooled SNP datasets? It looks like the tutorial is for a file with individual genotypes.


What do you mean by "pooled"? Do you mean merging different datasets? In that case, you will be dealing with potential batch and/or technical artefacts.


Individuals were pooled prior to sequencing, so each library contains DNA from multiple individuals. I'm still not sure about PLINK, but I did come across someone else who used the prcomp function in base R to run PCA on pool-seq allele frequencies.
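
For anyone landing here, a rough sketch of what the allele-frequency approach could look like. It is written in Python with NumPy rather than R, and it mirrors what prcomp does with its defaults (center each variable, no scaling, then SVD). The function name `allele_freq_pca`, the `min_depth` cutoff, and the toy read counts are all made up for illustration. It also ignores the extra sampling noise that comes from pool size and uneven read depth, so treat it as a starting point rather than a recommended pipeline.

```python
import numpy as np


def allele_freq_pca(ref_counts, alt_counts, min_depth=10):
    """PCA on pool-level allele frequencies.

    ref_counts, alt_counts: arrays of shape (n_pools, n_snps) holding
    reference and alternate read counts per pool and SNP.
    min_depth: hypothetical cutoff; a SNP is kept only if every pool
    has at least this many reads at that site.
    """
    ref = np.asarray(ref_counts, dtype=float)
    alt = np.asarray(alt_counts, dtype=float)
    depth = ref + alt

    # Keep SNPs with enough coverage in every pool.
    keep = (depth >= min_depth).all(axis=0)
    freq = alt[:, keep] / depth[:, keep]

    # Drop SNPs whose frequency is identical in all pools.
    freq = freq[:, freq.std(axis=0) > 0]

    # Center each SNP, no scaling (same as prcomp's defaults).
    centered = freq - freq.mean(axis=0)

    # SVD of the centered matrix; rows of `scores` are pools.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u * s
    explained = s**2 / np.sum(s**2)
    return scores, explained


# Toy data: 4 pools x 5 SNPs (invented counts, not real data).
ref = [[30, 10, 25, 40, 5],
       [28, 12, 24, 38, 0],
       [10, 30, 26, 15, 20],
       [12, 28, 25, 14, 22]]
alt = [[10, 30, 15, 0, 20],
       [12, 28, 16, 2, 3],
       [30, 10, 14, 25, 20],
       [28, 10, 15, 26, 18]]

scores, explained = allele_freq_pca(ref, alt)
print(np.round(explained, 3))   # fraction of variance per component
print(np.round(scores[:, :2], 3))  # PC1 and PC2 coordinates per pool
```

In the toy data the last SNP is dropped because one pool has only 3 reads there, and the first two pools end up on the opposite side of PC1 from the last two. With real pool-seq data you would probably also want to filter on minor allele frequency and think about whether pools with very different sizes or coverage should be weighted equally.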


What did you end up doing for this?
