3 months ago
DareDevil
Hello,
I have a VCF file generated by GATK, along with the reference genome used for the analysis. I would like to detect the amino acid changes caused by the variants in the VCF file without running an annotation tool such as VEP. I understand that this involves interpreting the variants that fall within coding regions and translating the resulting nucleotide changes into amino acids.
Specifically, I would like to:
- Parse the VCF file to extract variant information.
- Use the reference genome to determine the sequence context around each variant.
- Identify the codon changes and translate them to detect amino acid changes.
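The three steps above could be sketched roughly as follows. This is a minimal illustration, not a replacement for a real annotator: it assumes a single-nucleotide variant inside a forward-strand, single-exon CDS whose start coordinate is already known. The names `parse_vcf_line`, `codon_change`, and `cds_start` are hypothetical, and the reference genome is assumed to be loaded into a plain dict of sequence strings.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_vcf_line(line):
    """Return (chrom, 1-based pos, ref, alt) from one VCF data line."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return chrom, int(pos), ref, alt.split(",")[0]  # first ALT allele only


def codon_change(genome, chrom, pos, ref, alt, cds_start):
    """Codon and amino acid change for an SNV (assumption: forward strand,
    single-exon CDS, cds_start = 1-based position of the first coding base)."""
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("this sketch handles SNVs only")
    seq = genome[chrom]
    if seq[pos - 1].upper() != ref.upper():
        raise ValueError("REF allele does not match the reference genome")
    offset = pos - cds_start            # 0-based offset within the CDS
    in_codon = offset % 3               # 0, 1 or 2: position inside the codon
    codon_start = pos - 1 - in_codon    # 0-based genome index of the codon
    ref_codon = seq[codon_start:codon_start + 3].upper()
    alt_codon = ref_codon[:in_codon] + alt.upper() + ref_codon[in_codon + 1:]
    return (ref_codon, alt_codon,
            CODON_TABLE[ref_codon], CODON_TABLE[alt_codon],
            offset // 3 + 1)            # 1-based codon number


# Toy example: CDS "ATG GCC AAA TAA" starting at position 3 of chr1.
genome = {"chr1": "GGATGGCCAAATAA"}
variant = parse_vcf_line("chr1\t7\t.\tC\tA\t50\tPASS\t.")
result = codon_change(genome, *variant, cds_start=3)
print(result)
```

Indels, multi-allelic sites, spliced transcripts, and minus-strand genes are deliberately left out here; they are the parts that make dedicated tools worthwhile.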
That is essentially what VEP, SnpEff, bcftools csq, and similar tools do.
You would also need a GFF/GTF file with the gene models annotated on that reference genome. You could then use bedtools intersect to overlap your genomic variants with the CDS features and work out the codon changes, taking the phase into account.
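As a rough sketch of the phase bookkeeping, the function below takes the CDS segments of one transcript (as they might come out of the GFF after intersecting) and reports which codon a genomic position falls in and where inside that codon. The names `cds_offset`, `first_phase`, and `revcomp` are hypothetical, and the sketch assumes all segments belong to a single transcript and do not overlap.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a sequence (needed for minus-strand codons/alleles)."""
    return seq.translate(COMPLEMENT)[::-1]


def cds_offset(segments, strand, pos, first_phase=0):
    """Return (codon_number, position_in_codon) for a 1-based genomic pos.

    segments: list of (start, end) CDS intervals, 1-based inclusive as in GFF.
    strand: "+" or "-".
    first_phase: GFF phase of the first CDS segment in transcription order,
        i.e. how many leading bases to skip to reach the first full codon.
    Returns None if pos is outside every segment or in an incomplete
    leading codon. codon_number is 1-based, position_in_codon is 0-based.
    """
    # Walk the segments in transcription order: ascending for "+",
    # descending for "-".
    ordered = sorted(segments, reverse=(strand == "-"))
    consumed = 0  # coding bases in the segments before the one holding pos
    for start, end in ordered:
        if start <= pos <= end:
            within = pos - start if strand == "+" else end - pos
            offset = consumed + within - first_phase
            if offset < 0:
                return None
            return offset // 3 + 1, offset % 3
        consumed += end - start + 1
    return None


# Toy transcript with two CDS segments of 4 and 5 bases.
segments = [(101, 104), (201, 205)]
print(cds_offset(segments, "+", 201))  # 5th coding base on the plus strand
print(cds_offset(segments, "-", 104))  # 6th coding base on the minus strand
```

For minus-strand genes the reference codon and the ALT allele would both need to go through something like `revcomp` before translation, since VCF alleles are always given on the forward strand.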