Question:
What is the best and fastest way to query a VCF file for specific variants?
Background:
I have a tab-separated list of variants where the columns are CHROM, POS, REF, ALT. I would like to query a VCF file and get the records associated with only these specific variants. I know BCFtools, VCFtools, and tabix all allow you to supply a regions/positions file to search on CHROM and POS only, but I am interested in searching on CHROM, POS, REF, and ALT.
I also know this is very easy to do with grep, but grep doesn't take advantage of the VCF index file like the other tools do. As a result it is much slower, especially when searching very large VCF files.
Hello, I was trying to use the above code, but I get many grep errors, and the part $3,$4"\t"}' "$variants") does not recognize my variants file (I used my own file in the correct format), among other errors. Could you please share some code that works? It would be great if you could share code that can extract a big list of variants. Thanks.
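One possible approach, as a rough sketch rather than a tested recipe: let the index do the coarse lookup on CHROM and POS, then apply an exact CHROM/POS/REF/ALT filter to the much smaller stream that comes back. The function names `filter_exact_alleles` and `query_variants` below are made up for illustration. The sketch assumes the VCF is bgzip-compressed and indexed, that `bcftools` is on the PATH, and that the variants file is tab-separated with no header line and at least one row. It also strips Windows line endings, which are a common reason a variants file is "not recognized".

```shell
#!/bin/sh
# Keep only VCF records whose CHROM, POS, REF and ALT match a row of the
# tab-separated variants list given as $1. The VCF is read from stdin and
# header lines are passed through unchanged.
filter_exact_alleles() {
    awk -F'\t' '
        { sub(/\r$/, "") }                                      # drop a trailing CR, if any
        NR == FNR { wanted[$1 FS $2 FS $3 FS $4] = 1; next }    # first input: variants list
        /^#/      { print; next }                               # VCF header lines
        {
            n = split($5, alts, ",")                            # ALT may list several alleles
            for (i = 1; i <= n; i++)
                if (($1 FS $2 FS $4 FS alts[i]) in wanted) { print; next }
        }
    ' "$1" -
}

# Hypothetical wrapper: indexed lookup on CHROM and POS, then the exact filter.
# Usage: query_variants variants.tsv input.vcf.gz > subset.vcf
query_variants() {
    variants=$1
    vcf=$2
    regions=$(mktemp)
    # The regions file only needs CHROM and POS; REF and ALT are checked afterwards.
    cut -f1,2 "$variants" | tr -d '\r' > "$regions"
    bcftools view -R "$regions" "$vcf" | filter_exact_alleles "$variants"
    rm -f "$regions"
}
```

No explicit loop is needed for a big list: the whole variants file is loaded into an awk array once, and each record returned by the indexed query is checked against it. A multi-allelic record is kept if any one of its ALT alleles matches a listed variant; if the alleles need to be matched one per line, the VCF could be split first (for example with `bcftools norm -m -`).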