How to compare SNPs in different populations
9 weeks ago
Elisa • 0

Hi everyone, I am doing a study in which I would like to cross-reference methylation data (from reduced representation bisulfite sequencing for now, but we plan to do it on the whole genome) with genomic SNPs (Illumina) from 2 different populations of the same species (one population isolated, the other large). By following the classical pipeline I arrived at a VCF file for each individual containing the SNPs called against the reference genome.

How can I tell which SNPs are unique to the isolated population compared to the large population? And is there a way to extract only the SNPs that fall in a CpG context?

Thanks a lot for the help

SNPs genome VCF • 278 views
9 weeks ago
dthorbur ★ 2.5k

What is the "classical pipeline"? If you use a tool like GATK, you can call variants jointly across a cohort, which can be more useful for population genomic work. See this page for information and a link to their best practices.

If you want to keep the existing individually called SNPs, you can try to merge all the VCFs into a single population- or experiment-wide file. In either case, it's a matter of parsing the GT field for each variant locus. If using R, you can read in VCFs using the library vcfR. Or use a population genomics library like PopGenome.
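To make the GT parsing concrete, here is a minimal sketch in plain Python (not R, and not taken from any of the libraries above). It assumes a merged multi-sample VCF read as text lines, with GT as the first FORMAT subfield; the sample names (`iso1`, `large1`, ...) and the function names are hypothetical. A SNP is reported as private when at least one isolated-population sample carries a non-reference allele and no large-population sample does.

```python
import re

def carries_alt(sample_field):
    """True if the GT subfield holds at least one called non-reference allele."""
    gt = sample_field.split(":")[0]        # GT is the first FORMAT subfield
    alleles = re.split(r"[/|]", gt)        # handles unphased (/) and phased (|)
    return any(a not in ("0", ".") for a in alleles)

def private_snps(vcf_lines, isolated, large):
    """Return (chrom, pos, ref, alt) for sites whose ALT is seen only in `isolated`."""
    col = {}
    out = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                       # skip blank lines and meta-information
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # sample columns start at index 9 in a VCF header line
            col = {name: i for i, name in enumerate(fields) if i >= 9}
            continue
        in_isolated = any(carries_alt(fields[col[s]]) for s in isolated)
        in_large = any(carries_alt(fields[col[s]]) for s in large)
        if in_isolated and not in_large:
            out.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return out

# Hypothetical toy data: two isolated and two large-population samples.
demo_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tiso1\tiso2\tlarge1\tlarge2",
    "chr1\t100\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t1/1\t0/0\t0/0",
    "chr1\t200\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\t0/0\t0/1\t0/0",
    "chr1\t300\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/0\t1/1\t0/1",
    "chr1\t400\t.\tC\tG\t50\tPASS\t.\tGT:DP\t1|1:12\t./.:.\t0/0:9\t./.:.",
]
private = private_snps(demo_vcf, ["iso1", "iso2"], ["large1", "large2"])
print(private)
```

Note that this treats a missing genotype (`./.`) as "does not carry the ALT allele". When individually called VCFs are merged, a missing genotype can simply mean the site was not called in that individual, so in practice you would probably want to require a minimum number of called genotypes in the large population before calling a SNP private.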

And yes, if you analysed methylation state using a different tool, say methylKit for example, you can then compare the position (POS) column in the VCFs against the methylation site coordinates to identify SNPs within n bp of a methylation site. Lots of different functions and libraries would be useful, including subset, data.table::foverlaps, and GRanges (from GenomicRanges).
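The two checks can be sketched in plain Python as follows. This is only an illustration under stated assumptions: SNPs and methylation sites are given as `(chrom, pos)` tuples in the same 1-based coordinate system, the reference sequence is available as a string per chromosome, and all names and toy data are hypothetical. The first function finds SNPs within n bp of a methylation site; the second tests whether a SNP position sits on the C or the G of a CG dinucleotide in the reference.

```python
from bisect import bisect_left

def snps_near_sites(snps, meth_sites, n):
    """Return the SNPs (chrom, pos) lying within n bp of any methylation site."""
    by_chrom = {}
    for chrom, pos in meth_sites:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()                      # bisect needs sorted positions
    hits = []
    for chrom, pos in snps:
        positions = by_chrom.get(chrom, [])
        i = bisect_left(positions, pos - n)   # first site at or after pos - n
        if i < len(positions) and positions[i] <= pos + n:
            hits.append((chrom, pos))
    return hits

def in_reference_cpg(ref_seq, pos):
    """True if 1-based `pos` is the C or the G of a CG dinucleotide in ref_seq."""
    i = pos - 1                               # convert to 0-based string index
    base = ref_seq[i].upper()
    if base == "C":
        return i + 1 < len(ref_seq) and ref_seq[i + 1].upper() == "G"
    if base == "G":
        return i > 0 and ref_seq[i - 1].upper() == "C"
    return False

# Hypothetical toy data; CpGs sit at positions 3-4 and 9-10 of this sequence.
reference = {"chr1": "AACGTTTACGA"}
snps = [("chr1", 3), ("chr1", 6), ("chr1", 10)]
meth_sites = [("chr1", 3), ("chr1", 9)]

near = snps_near_sites(snps, meth_sites, n=1)
cpg_snps = [(c, p) for c, p in snps if in_reference_cpg(reference[c], p)]
print(near)
print(cpg_snps)
```

The CpG check here only catches SNPs that hit an existing reference CpG. A SNP whose ALT allele creates a new CpG would need the ALT base compared against the neighbouring reference bases as well. It is also worth confirming that the methylation caller and the VCF use the same coordinate convention and strand before comparing positions.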
