Question

Visualize variants and percentage of variants from one sample of Amplicon Seq data?

22 months ago

Saran ▴ 50

Hello,

We are studying viral evolution by tracking the mutations present at a specific genomic location and how they change over time. At intervals, we perform amplicon sequencing of this region, which is 222-228 bp long. There are two versions of the virus, and 3 point mutations can cause a shift from virus A to virus B. For each sample, I would like to create a chart that shows each variant sequence and the percentage of reads that carry that sequence.
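As a rough illustration of the table described above, here is a minimal Python sketch that counts identical read sequences and reports each one's share of the total. It assumes the reads have already been trimmed to the amplicon and oriented the same way, so that identical variants yield identical strings. The function name `variant_table`, the `min_pct` cutoff, and the toy reads are hypothetical and only for illustration.

```python
from collections import Counter

def variant_table(reads, min_pct=0.0):
    """Return (sequence, count, percent) rows, most frequent first.

    `reads` is any iterable of sequence strings. `min_pct` is an
    optional (hypothetical) cutoff for hiding very rare sequences.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    rows = []
    for seq, n in counts.most_common():
        pct = 100.0 * n / total
        if pct >= min_pct:
            rows.append((seq, n, pct))
    return rows

# Toy reads, made up for illustration (not real data).
reads = ["ACGT"] * 6 + ["ACGA"] * 3 + ["TCGA"] * 1

for seq, n, pct in variant_table(reads):
    print(f"{seq}\t{n}\t{pct:.1f}%")
```

Counting exact sequences treats every sequencing error as its own "variant", so a percentage cutoff such as `min_pct` is one simple way to hide singletons; whether that is appropriate depends on the data.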

Using BWA, I aligned the reads to a FASTA file containing only two reference sequences: the 222 bp region of virus A and the 228 bp region of virus B. I then calculated the percentage of reads that aligned to each reference.
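The per-reference percentages mentioned above could be computed along these lines. This is a minimal sketch, assuming the input is SAM-format text lines (for example, the output of `samtools view -h`). It counts only primary alignments of mapped reads, using the standard SAM FLAG bits. The reference names `virusA` and `virusB`, the function name, and the toy records are hypothetical.

```python
from collections import Counter

UNMAPPED = 0x4        # SAM FLAG bit: read is unmapped
SECONDARY = 0x100     # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800 # SAM FLAG bit: supplementary alignment

def percent_per_reference(sam_lines):
    """Return {reference_name: percent of primary mapped reads}."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        counts[fields[2]] += 1  # column 3 is RNAME
    total = sum(counts.values())
    return {ref: 100.0 * n / total for ref, n in counts.items()}

# Toy SAM records, made up for illustration (not real data).
toy_sam = [
    "@SQ\tSN:virusA\tLN:222",
    "@SQ\tSN:virusB\tLN:228",
    "r1\t0\tvirusA\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tvirusA\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tvirusB\t1\t60\t4M\t*\t0\t0\tACGA\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t256\tvirusA\t1\t0\t4M\t*\t0\t0\t*\t*",
]

for ref, pct in percent_per_reference(toy_sam).items():
    print(f"{ref}\t{pct:.1f}%")
```

Because the two references differ by only a few point mutations, reads may map almost equally well to both, so low mapping qualities are worth checking before trusting these percentages.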

I am now lost on how to get the output I would like. (The image below is from CRISPResso2, but I want something similar for non-CRISPR data.)

[Image: example output from CRISPResso2]

I used GATK HaplotypeCaller to call variants, but I don't know whether that was the right approach, or whether I should do a multiple sequence alignment or something much simpler. I now have the VCF file from GATK but don't know where to go from here.

Thank you for any advice, Sara

Amplicon PCR RNAseq GATK BWA • 664 views

updated 22 months ago by cmdcolin ★ 4.0k • written 22 months ago by Saran ▴ 50

Just wanted to say this is an interesting visualization that CRISPResso has. Can you trick CRISPResso into using your data? Also, does the visualization try to filter out any "spurious read errors", or is it just raw reads?

22 months ago by cmdcolin ★ 4.0k