Variants only found on inversion reads in IGV
3.1 years ago
xukeren ▴ 10

Hi, I used the GATK germline variant-calling pipeline to call short variants from paired-end FASTQ files. After producing the final analysis-ready VCF and applying some extra filters, I inspected the BAM files in IGV for the variants of interest and found something strange in one sample: two variants of interest in this sample appear only on inversion reads.

In the first screenshot, the alternative allele G is found only on RR and LL reads (blue in IGV). 13 of the 15 inversion reads carry this G allele.

In the second screenshot, similarly, the alternative allele T is found only on inversion reads, and all of the inversion reads carry this T allele.

Further, I noticed that all the inversion reads have the same size, as shown in figure 3.

I wonder whether these are true inversions or artifacts (given that they are all the same size), in which case the variants found only on these reads would not be real either.

germline IGV variant GATK WGS • 2.3k views

inversion reads

Do you mean reads mapped on the reverse strand? It's a known problem: https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-13-666


Hi, thanks for the paper, I will read it. I'm not sure these blue LL and RR reads are simply reads mapped on the reverse strand. The IGV documentation says that reads tagged "LL" or "RR" imply inversions in the sequenced DNA with respect to the reference: https://software.broadinstitute.org/software/igv/interpreting_pair_orientations So it looks like these tags describe read-pair orientation rather than strand?

3.1 years ago
crisime ▴ 290

Hi xukeren,

Not every LL or RR read indicates a true inversion. I think you need to inspect your reads further in IGV to understand what is going on.
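If you want to check the pair orientation outside IGV, it can be derived from the SAM FLAG bits plus the read and mate positions. The following is only a minimal sketch: the function name `pair_orientation` is made up for illustration, and the mapping of IGV's "LL"/"RR" labels onto "both mates forward"/"both mates reverse" is my assumption about how IGV names them.

```python
def pair_orientation(flag, pos, pnext):
    """Classify a read pair from one SAM record.

    flag  : SAM FLAG (integer)
    pos   : 1-based leftmost position of this read (SAM POS)
    pnext : 1-based leftmost position of the mate (SAM PNEXT)

    Returns "FF", "RR", "FR" or "RF". Assumes both mates are mapped
    to the same chromosome.
    """
    read_reverse = bool(flag & 0x10)  # this read maps to the reverse strand
    mate_reverse = bool(flag & 0x20)  # its mate maps to the reverse strand

    if not read_reverse and not mate_reverse:
        return "FF"  # same strand; assumed to correspond to IGV "LL"
    if read_reverse and mate_reverse:
        return "RR"  # same strand; assumed to correspond to IGV "RR"

    # Opposite strands: inward-facing (FR) if the forward mate is leftmost.
    forward_pos = pnext if read_reverse else pos
    reverse_pos = pos if read_reverse else pnext
    return "FR" if forward_pos <= reverse_pos else "RF"


if __name__ == "__main__":
    # FLAG 99 = paired, proper pair, mate reverse, first in pair
    print(pair_orientation(99, 100, 300))
    # FLAG 65 = paired, first in pair, both mates on the forward strand
    print(pair_orientation(65, 100, 300))
```

The point is that a same-strand pair is a property of both mates together, which is different from a single read simply sitting on the reverse strand.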

It could be an ambiguous mapping due to the short length of the reads. What is the mapping quality of the reads? You can see it by clicking on a read.
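To look at mapping quality for many reads at once, you could export the region as SAM text and scan the MAPQ column (the fifth field). This is a rough sketch only: `low_mapq_reads` is a hypothetical helper, and the threshold of 20 is an arbitrary assumption, not a recommended cutoff.

```python
def low_mapq_reads(sam_lines, threshold=20):
    """Return (read name, MAPQ) for records whose MAPQ is below threshold.

    sam_lines: iterable of SAM text lines; header lines (starting with "@")
    are skipped. A MAPQ of 255 means "not available" in the SAM format and
    is therefore never reported as low here.
    """
    flagged = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, mapq = fields[0], int(fields[4])  # QNAME is field 1, MAPQ field 5
        if mapq < threshold:
            flagged.append((name, mapq))
    return flagged


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "@HD\tVN:1.6",
        "readA\t65\tchr1\t100\t0\t50M\t=\t300\t250\t*\t*",
        "readB\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    ]
    print(low_mapq_reads(example))
```

If most of the LL/RR reads turn up in this list, ambiguous placement becomes a more likely explanation than a real inversion.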

If you have soft-clipped bases, you should turn on their display: View->Preferences->Alignments->Show mismatched bases. Your short reads might then gain longer tails of colorful sequence that were not used in the mapping at this position, but might have been used for a supplementary alignment at another position. (You might have to reload your BAM in IGV for the setting to take effect.)
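The same information is in the SAM record itself: soft clips appear as `S` operations at the ends of the CIGAR string, and supplementary alignments are listed in the optional `SA:Z:` tag. Below is a small sketch for pulling both out of a record; the helper names `soft_clips` and `supplementary_alignments` are invented for this example.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clip lengths for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) sit outside soft clips and are absent from SEQ; drop them.
    ops = [item for item in ops if item[1] != "H"]
    if not ops:
        return (0, 0)
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


def supplementary_alignments(optional_fields):
    """Parse the SA tag into a list of (chromosome, position, strand).

    optional_fields: the optional SAM fields of one record (columns 12+).
    Each SA entry has the form rname,pos,strand,CIGAR,mapQ,NM and entries
    are separated by semicolons.
    """
    for field in optional_fields:
        if field.startswith("SA:Z:"):
            entries = [e for e in field[5:].split(";") if e]
            parsed = []
            for entry in entries:
                rname, pos, strand = entry.split(",")[:3]
                parsed.append((rname, int(pos), strand))
            return parsed
    return []


if __name__ == "__main__":
    print(soft_clips("30S70M"))
    print(supplementary_alignments(["NM:i:1", "SA:Z:chr2,5000,-,30M70S,60,0;"]))
```

A read whose clipped tail aligns elsewhere via the SA tag is a split read, which is worth comparing against where its mate maps.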

You need to find out where the mate of each of your RR/LL reads is: click on a specific read and look at the "Mate" section of the information shown. To visualize this easily, you can try turning on "View as pairs" in the right-click menu of the BAM track. If the mates are close enough, they are connected by thin lines. The other way is "Go to mate" after right-clicking on a specific read (only available if "View as pairs" is turned off). IGV will jump to the mapping position of the mate and highlight both reads in a unique color.
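The mate details that IGV shows come from the RNEXT, PNEXT and TLEN columns of the SAM record, so they can also be tabulated for all suspicious reads at once. This is a sketch under the assumption that you have the records as SAM text; `mate_info` and `tlen_counts` are hypothetical helper names. Counting the absolute TLEN values is one way to check the observation that all the reads have the same size.

```python
from collections import Counter


def mate_info(sam_line):
    """Extract the mate location and template length from one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    rname, rnext = fields[2], fields[6]
    # "=" in RNEXT means the mate is on the same reference as this read.
    mate_chrom = rname if rnext == "=" else rnext
    return {
        "name": fields[0],
        "mate_chrom": mate_chrom,
        "mate_pos": int(fields[7]),   # PNEXT, 1-based
        "tlen": int(fields[8]),       # signed template length
        "same_chrom": mate_chrom == rname,
    }


def tlen_counts(sam_lines):
    """Count how often each absolute template length occurs."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        counts[abs(mate_info(line)["tlen"])] += 1
    return counts


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "readA\t65\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
        "readB\t113\tchr1\t120\t60\t50M\tchr5\t9000\t0\t*\t*",
    ]
    for record in example:
        print(mate_info(record))
    print(tlen_counts(example))
```

A single dominant template length across all the LL/RR pairs, or mates that consistently land in one other region, would point toward a systematic artifact or a repeat rather than independent fragments spanning a real breakpoint.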
