Genetic PCA from poolseq genotype file
6.6 years ago
AP ▴ 100

Hello,

I have a sync file generated with PoPoolation2 that looks like this:

Contig    Position  Ref    Pool1           Pool2           Pool3           Pool4
SCAFOLD1    11722   A   330:0:0:0:0:0   315:0:0:0:0:0   334:0:0:0:0:0   111:0:0:0:0:0
SCAFOLD1    11723   T   0:330:0:0:0:0   0:316:0:0:0:0   0:334:0:0:0:0   0:111:0:0:0:0
SCAFOLD1    11725   T   0:327:0:0:0:0   0:314:0:0:0:0   0:329:0:0:0:0   0:111:0:0:0:0
SCAFOLD1    11726   A   330:0:0:0:0:0   314:0:0:0:0:0   332:0:0:0:0:0   111:0:0:0:0:0

Each cell contains the allele counts for that pool in the order A:T:C:G:N:del (e.g. 330:0:0:0:0:0 means 330 reads supporting A and none for anything else).

I would like to perform a genetic PCA on this dataset, just as one would on a 012 file extracted with VCFtools. I guess one could convert the sync file to a single value per cell by summing the counts of all non-reference alleles, and work from that.
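For illustration, the conversion described above could look something like the sketch below. It assumes the standard sync column order (A:T:C:G:N:del), ignores the N and deletion counts, and reports a non-reference frequency rather than a raw count. The function names and the second example line (a polymorphic site) are made up for the example and are not from the real dataset.

```python
# Column order of a PoPoolation2 sync cell: A:T:C:G:N:del
BASES = ("A", "T", "C", "G")

def parse_cell(cell):
    """Split 'A:T:C:G:N:del' into a dict of counts for the four bases."""
    counts = [int(x) for x in cell.split(":")]
    return dict(zip(BASES, counts[:4]))  # N and del counts are ignored

def nonref_frequency(ref, cell):
    """Fraction of A/T/C/G reads that differ from the reference base.

    Returns None when the pool has no coverage or ref is not A/T/C/G.
    """
    counts = parse_cell(cell)
    total = sum(counts.values())
    ref = ref.upper()
    if total == 0 or ref not in counts:
        return None
    return (total - counts[ref]) / total

def sync_to_frequencies(lines):
    """Return (contig, position, [frequency per pool]) for each sync line."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and any header line (position is not a number).
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        contig, pos, ref = fields[:3]
        freqs = [nonref_frequency(ref, cell) for cell in fields[3:]]
        rows.append((contig, int(pos), freqs))
    return rows

example = [
    # First data line from the question (monomorphic in all pools).
    "SCAFOLD1 11722 A 330:0:0:0:0:0 315:0:0:0:0:0 334:0:0:0:0:0 111:0:0:0:0:0",
    # Hypothetical polymorphic site, invented for illustration only.
    "SCAFOLD1 20000 A 90:10:0:0:0:0 50:50:0:0:0:0 100:0:0:0:0:0 25:75:0:0:0:0",
]
table = sync_to_frequencies(example)
for contig, pos, freqs in table:
    print(contig, pos, freqs)
```

Note that the four sites shown in the question are monomorphic in every pool, so they would carry no signal in a PCA; only sites that vary between pools are informative.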

Does anybody have experience with that? Any opinion/comment would be very helpful.

Thanks!

PCA Poolseq genotype popoolation • 1.9k views

Hi, did you find out how to perform the PCA? I also obtained a sync file using PoPoolation2 and a VCF using GATK, and I have been trying to perform a PCA using either file, but with no success yet. Thank you,

Natalia


I managed following your method. Thanks a million!

5.3 years ago
AP ▴ 100

Hi Natalia,

Yes, I did manage to run a PCA using the sync file. I first calculated the frequency of the minor allele (or the major allele) for every SNP in each pool, then ran a PCA in R using prcomp. Instead of the frequency, you can also just use the total count of the minor or major allele. The same approach works on a 012 file.
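For anyone who prefers Python, here is a rough sketch of the same idea using NumPy. It is not the R code from this answer: it centres each SNP column and takes an SVD, which mirrors the prcomp defaults (center = TRUE, scale. = FALSE). The `pca` function and the frequency matrix are hypothetical, made up for the example. The matrix has pools as rows and SNPs as columns, so a table laid out like the sync file (SNPs as rows) would need transposing first. Frequencies are used rather than raw counts because counts scale with coverage, and the pools in the question differ in depth (about 111 reads in Pool4 versus about 330 in Pool1).

```python
import numpy as np

def pca(freqs, n_components=2):
    """PCA via SVD on a pools x SNPs matrix of allele frequencies.

    Each SNP column is centred but not scaled.
    """
    X = np.asarray(freqs, dtype=float)
    X = X[:, X.std(axis=0) > 0]           # drop SNPs that do not vary
    Xc = X - X.mean(axis=0)               # centre each SNP column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * S                        # coordinates of each pool on the PCs
    explained = S**2 / np.sum(S**2)       # proportion of variance per PC
    return scores[:, :n_components], explained[:n_components]

# Hypothetical minor-allele frequencies: 4 pools (rows) x 5 SNPs (columns).
freqs = [
    [0.10, 0.50, 0.00, 0.20, 0.05],
    [0.12, 0.45, 0.00, 0.25, 0.05],
    [0.80, 0.05, 0.00, 0.60, 0.05],
    [0.75, 0.10, 0.00, 0.65, 0.05],
]
scores, explained = pca(freqs)
print(scores)      # one row per pool, columns are PC1 and PC2
print(explained)   # fraction of variance carried by PC1 and PC2
```

With only a handful of pools there are at most (number of pools - 1) informative components, so the plot is mostly useful for seeing which pools group together.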

Hope that helps! AP
