Hi,
I need to filter a VCF file, keeping only those SNPs that match a separate list with 3 columns: ID, reference allele and alternate allele.
I am very new to this kind of procedure, so I am trying to work out the most effective strategy.
It has been suggested that I use VCFtools or BCFtools, but I am not sure whether they can also select variants on the basis of their ref/alt alleles. Is it possible to do this using just the command line?
Thank you
Hey, this works, but it takes a very long time. What I did instead was append the reference and alternate allele letters to the SNP ID column and then use VCFtools to make the selection.
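For what it's worth, the same ID + REF + ALT matching can also be done in a single pass with a short script. Below is a minimal sketch in Python (standard library only), not a VCFtools or BCFtools feature. The function names `load_wanted` and `filter_vcf` and the sample data are made up for illustration. It assumes the list is whitespace-separated with ID, REF, ALT in that order, that the VCF is uncompressed and tab-separated, and that ALT must match exactly (so a multi-allelic record such as `G,T` would not match a single-allele entry).

```python
import io


def load_wanted(handle):
    """Read lines of 'ID REF ALT' into a set of (ID, REF, ALT) tuples."""
    wanted = set()
    for line in handle:
        fields = line.split()
        if len(fields) >= 3:
            wanted.add((fields[0], fields[1], fields[2]))
    return wanted


def filter_vcf(vcf_handle, wanted):
    """Yield header lines plus records whose (ID, REF, ALT) is in wanted."""
    for line in vcf_handle:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Fixed VCF columns: CHROM, POS, ID, REF, ALT, ...
        if len(cols) >= 5 and (cols[2], cols[3], cols[4]) in wanted:
            yield line


# Small in-memory demo (hypothetical data); real files would be opened with open().
snp_list = io.StringIO("rs1 A G\nrs3 C T\n")
vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\n"
    "1\t200\trs2\tC\tT\t.\tPASS\t.\n"
    "1\t300\trs3\tC\tA\t.\tPASS\t.\n"
)
kept = list(filter_vcf(vcf, load_wanted(snp_list)))
print("".join(kept), end="")
```

Because the wanted variants sit in a set, each record costs one lookup, so the run time should grow roughly with the size of the VCF rather than with VCF size times list size. In the demo, rs3 is dropped because its ALT allele differs from the one in the list, even though the ID matches.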