Removing individuals from a VCF file and plotting allele frequency differences
22 months ago
Ollie • 0

Hello,

I have a VCF file containing multiple variants across several individuals. I need to remove several of the individuals from the VCF to focus only on those from specific populations, and then create allele frequency difference dot plots for them. I'm very new to this type of work and don't really know where to start.

I've started to create a conda environment and install gatk into it, but I can only find how to remove variants rather than sampled individuals from the vcf. If anybody could point me in the right direction it would be greatly appreciated. I can provide more details if needed.
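For the allele frequency difference step, here is a minimal sketch. It assumes you already have one tab-separated table per population with the columns CHROM, POS and AF (for example written out by bcftools query after subsetting each population). The function name af_diff and the file names pop1.tsv and pop2.tsv are hypothetical, and the sketch only handles one allele frequency per site.

```shell
# af_diff POP1.tsv POP2.tsv   (hypothetical helper, not part of bcftools)
# Each input: tab-separated CHROM, POS, AF, one biallelic site per line.
# Prints CHROM, POS, AF in pop 1, AF in pop 2 and AF1 - AF2 for sites in both files.
af_diff() {
    awk 'BEGIN { FS = OFS = "\t" }
         # first file: remember the allele frequency of every site
         NR == FNR { af[$1 FS $2] = $3; next }
         # second file: report only sites that were also in the first file
         ($1 FS $2) in af {
             printf "%s\t%s\t%s\t%s\t%.4f\n", $1, $2, af[$1 FS $2], $3, af[$1 FS $2] - $3
         }' "$1" "$2"
}

# Example call (file names are placeholders):
# af_diff pop1.tsv pop2.tsv > af_difference.tsv
```

The resulting table has one row per shared site, so the last column can be plotted against position as a dot plot in whatever plotting tool you prefer.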

vcf • 1.0k views

Check out bcftools view and its -s parameter.
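As a rough sketch of what that could look like: -s takes a comma-separated list of sample names, -S takes a file with one name per line, and a leading ^ inverts the selection. The wrapper names keep_samples and drop_samples and the file names are only illustrative.

```shell
# keep_samples IN.vcf SAMPLES.txt OUT.vcf   (hypothetical wrapper)
# SAMPLES.txt: one sample name per line, as listed by `bcftools query -l IN.vcf`.
keep_samples() {
    bcftools view -S "$2" -o "$3" "$1"
}

# drop_samples IN.vcf SAMPLES.txt OUT.vcf   (hypothetical wrapper)
# The leading ^ inverts the selection, so the listed samples are removed.
# It is quoted because some shells treat an unquoted ^ specially.
drop_samples() {
    bcftools view -S "^$2" -o "$3" "$1"
}

# Example calls (file names are placeholders):
# keep_samples all.vcf population_A.txt population_A.vcf
# drop_samples all.vcf unwanted.txt remaining.vcf
```

Note that INFO fields such as AC and AN are recalculated for the retained samples by default when subsetting, so check the bcftools manual if you need the original values.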


Thanks, I'm trying this method out now, but I'm getting an error. I've posted the error in a reply to another comment.


Thanks for this. I've just tried it out but I'm getting the following error (I'll include the code I used as well):

bcftools view -S ^indiv.txt test_1.vcf > filtered.vcf
[W::vcf_parse] Contig '237' is not defined in the header. (Quick workaround: index the file with tabix.)
Undefined tags in the header, cannot proceed in the sample subset mod

Quick workaround: index the file with tabix.

Have you tried indexing the file with tabix or adding the contig to the header?
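In case it helps, the indexing workaround might look roughly like this. tabix needs the file compressed with bgzip (plain gzip will not work), and the function name index_vcf is just for illustration. Whether indexing alone clears the "Undefined tags" error is an assumption. If it does not, the header probably needs a ##contig=<ID=237> line (and definitions for any undeclared INFO/FORMAT tags) added to it.

```shell
# index_vcf IN.vcf   (hypothetical helper)
# Writes IN.vcf.gz (BGZF-compressed copy) and IN.vcf.gz.tbi (tabix index).
index_vcf() {
    bgzip -c "$1" > "$1.gz" &&   # -c writes to stdout, so the original file is kept
    tabix -p vcf "$1.gz"         # -p vcf selects the VCF preset for indexing
}

# Example call (file name taken from the post above):
# index_vcf test_1.vcf
```

After that, rerun the subsetting command on test_1.vcf.gz instead of test_1.vcf.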
