I have a VCF file that contains SNP differences compared to GRCh37. How can I extract the nucleotide at a specific location (or a short sequence such as a codon) by specifying the location(s)?
As an example, I was thinking I could use bcftools view snp.vcf.gz 12:525-525, but since my VCF file only contains variants, I often get an empty result when there is no variation at that position. Maybe this requires a short script that returns the base from the reference genome when there is no variation. Would that be a good way of doing this, and if so, how could I pull the data from the reference genome? Thank you in advance.
For example, is there a tool where I can specify something like "command snp.vcf.gz 12:525-527" and the output would be a codon such as "ATA"?
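To make the fallback idea concrete, here is a minimal Python sketch of the logic described above: use the variant base where the VCF has a SNP, otherwise use the reference base. Everything in it is an assumption for illustration. The function name extract_region is hypothetical, the reference slice and SNP table are toy data rather than real GRCh37 sequence, and it assumes the reference bases and SNPs have already been loaded (for instance, a reference slice could be pulled with samtools faidx). It also handles only single-base substitutions and ignores genotype, so a heterozygous call is treated as if the alternate allele were present.

```python
def extract_region(ref_seq, ref_start, snps, start, end):
    """Return bases start..end (1-based, inclusive) with SNPs applied.

    ref_seq   -- reference bases for a slice of one chromosome
    ref_start -- 1-based genomic position of ref_seq[0]
    snps      -- dict mapping 1-based position -> alternate base (SNPs only)
    """
    ref_end = ref_start + len(ref_seq) - 1
    if start > end or start < ref_start or end > ref_end:
        raise ValueError("region is outside the loaded reference slice")

    bases = []
    for pos in range(start, end + 1):
        # Use the variant base if there is a SNP here, else the reference base.
        bases.append(snps.get(pos, ref_seq[pos - ref_start]))
    return "".join(bases)


# Toy data (hypothetical): this slice covers positions 521..530.
ref = "ACGTATGACC"
snps = {527: "A"}  # reference G at 527 replaced by A

print(extract_region(ref, 521, snps, 525, 527))  # ATA
print(extract_region(ref, 521, {}, 525, 527))    # ATG (no variants: pure reference)
```

Positions with no entry in the SNP table fall through to the reference, which is the behaviour the empty bcftools view result does not give you on its own.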
I think the FastaAlternateReferenceMaker tool is exactly what I am looking for. Thank you!