How to filter .vcf based on .gbk file to remove SNP calls in non-CDS regions?
5.0 years ago

I have a VCF file with multiple individuals mapped to a reference. I would like to filter the VCF file so that it only includes SNPs from CDS regions. I have a GenBank (.gbk) file from NCBI for the reference, which includes the CDS annotations. Is there a simple way to do this? I can't seem to find any resources related to this type of filtering.

Additionally, once this filtering is complete, I would like to remove synonymous SNPs from the VCF, so that I am left with only non-synonymous SNPs in coding regions in my final VCF file.

SNP genome • 1.4k views
5.0 years ago

Convert the GenBank file to a snpEff database: http://snpeff.sourceforge.net/SnpEff_manual.html#databases

Annotate the VCF with snpEff.

Filter the annotated VCF with SnpSift.
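To illustrate what the last step amounts to, here is a minimal Python sketch of the filter applied to an already annotated VCF. It is not SnpSift itself, just a plain-text stand-in. It assumes snpEff wrote its annotations to the INFO column as an ANN entry with pipe-separated subfields, where the second subfield is the effect term. The set of terms treated as non-synonymous, the function names, and the example records are all assumptions made for illustration.

```python
# Effect terms treated as non-synonymous here. This set is an assumption;
# adjust it to the effects you actually want to keep.
NONSYN = {"missense_variant", "stop_gained", "stop_lost", "start_lost"}


def ann_effects(info):
    """Return the set of effect terms found in the ANN entry of an INFO string."""
    effects = set()
    for field in info.split(";"):
        if field.startswith("ANN="):
            # One ANN entry can hold several comma-separated annotations.
            for ann in field[4:].split(","):
                parts = ann.split("|")
                if len(parts) > 1:
                    # Combined effects are joined with '&'.
                    effects.update(parts[1].split("&"))
    return effects


def keep_nonsynonymous(lines):
    """Yield header lines unchanged, plus records with a non-synonymous effect."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 8 and ann_effects(cols[7]) & NONSYN:
            yield line


# Hypothetical annotated records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|geneA\n",
    "chr1\t200\t.\tC\tT\t50\tPASS\tANN=T|synonymous_variant|LOW|geneA\n",
    "chr1\t300\t.\tG\tA\t50\tPASS\tANN=A|intergenic_region|MODIFIER|geneA-geneB\n",
]
kept = list(keep_nonsynonymous(example))
```

Because a non-synonymous effect can only be assigned to a variant inside a CDS, keeping these records should cover both parts of the question at once: the synonymous and the non-coding calls are dropped together.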


Fantastic, thank you!
