4.6 years ago
asgara
Hello!
I would like to count the SNPs in a sample (a .vcf file) that are located in specific genomic regions (which I have as a .bed file). I also have two samples (as two separate .vcf files) and would like to do the same for them, but reporting only the SNPs that are present in sample 2 and not in sample 1, still restricted to the genomic regions in the .bed file.
The files are small, so I was wondering whether this can be solved with some Python code.
Any suggestions?
I hope the description of the problem was understandable and not too confusing.
Thanks!
Have a look at bcftools view. Read the manual about --regions-file and --include: http://samtools.github.io/bcftools/bcftools.html

Thanks for the answer! I know there are some existing tools which can probably do it smoothly, but I was wondering if the same results can also be obtained with some Python code.
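
For small files, a plain-Python approach along these lines might be enough. This is only a minimal sketch under several assumptions that are not stated in the question: the VCFs are uncompressed, tab-separated text; every record in a file counts as a variant of that sample (genotypes are not inspected); a SNP is a single-base REF with a single-base ALT allele; and two SNPs are "the same" when chromosome, position, REF and ALT all match. The function names (`read_bed`, `read_snps`, `count_snps`, `private_snps`) are made up for illustration. BED coordinates are treated as 0-based and half-open, and VCF positions as 1-based.

```python
from collections import defaultdict

BASES = {"A", "C", "G", "T"}


def read_bed(lines):
    """Return {chrom: [(start, end), ...]} from BED lines (0-based, half-open)."""
    regions = defaultdict(list)
    for line in lines:
        # Skip blank lines, comments and optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return regions


def read_snps(lines):
    """Return the set of (chrom, pos, ref, alt) SNP alleles from VCF lines."""
    snps = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records are split; indels and symbolic alleles are ignored.
        for alt in alts.split(","):
            if ref.upper() in BASES and alt.upper() in BASES:
                snps.add((chrom, int(pos), ref.upper(), alt.upper()))
    return snps


def in_regions(snp, regions):
    """True if the 1-based VCF position falls inside any BED interval."""
    chrom, pos = snp[0], snp[1]
    # 1-based pos lies in 0-based half-open [start, end) when start < pos <= end.
    return any(start < pos <= end for start, end in regions.get(chrom, ()))


def count_snps(vcf_lines, regions):
    """Number of distinct SNP alleles located in the BED regions."""
    return sum(in_regions(snp, regions) for snp in read_snps(vcf_lines))


def private_snps(vcf1_lines, vcf2_lines, regions):
    """SNPs present in sample 2 but not in sample 1, restricted to the regions."""
    only_in_2 = read_snps(vcf2_lines) - read_snps(vcf1_lines)
    return {snp for snp in only_in_2 if in_regions(snp, regions)}


# Example usage with files (the file names are placeholders):
#
# with open("regions.bed") as bed, open("s1.vcf") as v1, open("s2.vcf") as v2:
#     regions = read_bed(bed)
#     print(len(private_snps(v1, v2, regions)))
```

The linear scan over intervals in `in_regions` is chosen for simplicity and should be fine for small files; for large region sets, a sorted interval list with binary search or an interval tree would scale better. Because SNPs are stored in a set, duplicate records are counted once.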