I ran the following command to extract a subset of regions in my VCF file using tabix:
tabix -h myfile.vcf.gz "11:5247360-5247664" > myfilenew.vcf.gz
For some reason, no matter how I vary my regions, one particular site, 5247358, always comes up in my output. Why is this the case?
P.S. noob in variant calling and analysing VCF files.
Can you show the full VCF line that comes up? It probably has to do with the length of the variant.
Hello and welcome Mehulsharma.253,
The quotation marks shouldn't be necessary. Could you please show the variant(s) that you don't expect? I guess it will be an indel that overlaps the region you specify.
fin swimmer
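To illustrate the overlap idea, here is a small sketch. It assumes a record's span on the reference runs from POS to POS + length(REF) - 1, and that a region query returns every record whose span intersects the requested interval. The REF alleles below are hypothetical, since the actual VCF line has not been shown yet, and overlaps is just a helper name made up for this example.

```shell
# overlaps POS REF START END
# Prints "yes" if the span [POS, POS + length(REF) - 1] intersects
# the query interval [START, END], otherwise "no".
overlaps() {
    pos=$1; ref=$2; start=$3; end=$4
    rec_end=$((pos + ${#ref} - 1))
    if [ "$pos" -le "$end" ] && [ "$rec_end" -ge "$start" ]; then
        echo yes
    else
        echo no
    fi
}

# Hypothetical SNV at 5247358: span is 5247358-5247358, outside the region
overlaps 5247358 A 5247360 5247664       # prints "no"

# Hypothetical deletion with a 5 bp REF: span is 5247358-5247362, inside the region
overlaps 5247358 ACTTT 5247360 5247664   # prints "yes"
```

So a record that starts 2 bp before the region can still be reported if its REF allele is long enough to reach into it.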
BTW: "> myfilenew.vcf.gz" will not create a compressed file. You have to pipe the output of tabix through bgzip:
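A minimal sketch of that pipe, assuming tabix and bgzip (both from htslib) are on your PATH. extract_region is only a hypothetical wrapper name for illustration; the file names and region are the ones from the question.

```shell
# extract_region INPUT REGION OUTPUT
# Extracts REGION (with the header, -h) from a bgzipped, tabix-indexed VCF
# and writes the result as a bgzip-compressed file.
extract_region() {
    tabix -h "$1" "$2" | bgzip -c > "$3"
}

# Usage with the files from the question:
# extract_region myfile.vcf.gz 11:5247360-5247664 myfilenew.vcf.gz
```

If you want to query the new file with tabix again, it needs its own index, e.g. tabix -p vcf myfilenew.vcf.gz.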