Why are there symmetric splice junctions on IGV?
2.9 years ago
Apex92 ▴ 300

Hi, I have the same problem as described in the earlier post "Interpreting splice junctions on IGV". Could anyone please elaborate on this? The IGV guide says: When available, IGV uses the "XS" tag provided by the alignment to determine strandedness. If missing, strandedness is inferred from the read strand.

My reads are single-end and stranded (single-cell data).

Any help is highly appreciated.

Thank you.

sequencing bam IGV alignment rna-seq • 1.8k views

what are the stranded counts (see here) for any particular gene?


I have a merged BAM file that I converted to SAM; it contains 10145130 lines (reads). Using the method below (from the link you shared), I get 5071949 reads in forwardStrandReads.sam and 5072949 reads in reverseStrandReads.sam, which almost adds up to the total number of reads in the main SAM file. Given that, do you think the reads are unstranded? I also checked the featureCounts command I used for counting: I had used -s0 (meaning unstranded), and the "Successfully assigned alignments" percentage is 95%.

samtools view Reads.bam | gawk '(and(16, $2))' > forwardStrandReads.sam
samtools view Reads.bam | gawk '(! and(16, $2))' > reverseStrandReads.sam

you need to look at one gene - any gene - to see if it's stranded or unstranded
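
A minimal sketch of that per-gene check, assuming a coordinate-sorted and indexed BAM. The function name `strand_counts` and the region in the usage line are hypothetical placeholders, not something from this thread. It counts alignments by the 0x10 bit of the SAM FLAG (set when the read aligns to the reverse strand) and avoids gawk-only functions so it should run under any awk.

```shell
# Count mapped alignments per strand from SAM records on stdin.
# FLAG bit 0x10 set means the read aligned to the reverse strand.
strand_counts() {
  awk -F'\t' '
    /^@/ { next }                      # skip header lines if present
    int($2 / 4) % 2 { next }           # skip unmapped reads (FLAG bit 0x4)
    { if (int($2 / 16) % 2) rev++; else fwd++ }
    END { printf "forward=%d reverse=%d\n", fwd + 0, rev + 0 }
  '
}

# Hypothetical usage; replace the region with the coordinates of one gene:
# samtools view Reads.bam chr1:1000000-1010000 | strand_counts
```

For a stranded single-end library, one would expect nearly all reads over a single gene to fall on one strand; a roughly even split within one gene suggests an unstranded library.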


Thank you Jeremy for giving input - I looked at the bam files, my reads are unstranded (I thought they were stranded).


the XS tag is added by the aligner, and indeed, if it is missing, you probably have about a 50-50 split of positive- and negative-strand reads spanning a particular splice site, which makes the junction appear symmetric. The XS tag infers the strand as follows: there is a splice site here, and no matter the strand indicated on the read, ON THE GENOME WHERE THE READ ALIGNS, if the canonical splice motif (GT/AG) is on the + strand, then the aligner says: this read contributes to splicing of a gene on the + strand. The same logic applies to the negative strand.
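
One way to see whether the aligner wrote the tag at all is to tally it over spliced alignments. This is only a sketch: the function name `xs_tally` is made up for illustration, and it matches the character-typed `XS:A:` tag only, because some aligners (BWA and Bowtie 2, for example) use an integer `XS:i:` tag for an unrelated alignment score.

```shell
# Tally the XS strand tag on spliced alignments (CIGAR contains N).
xs_tally() {
  awk -F'\t' '
    /^@/ { next }                      # skip header lines if present
    $6 !~ /N/ { next }                 # keep only reads that span an intron
    {
      tag = "missing"
      for (i = 12; i <= NF; i++)       # optional fields start at column 12
        if ($i ~ /^XS:A:[+-]$/) tag = substr($i, 6, 1)
      n[tag]++
    }
    END { printf "plus=%d minus=%d missing=%d\n", n["+"], n["-"], n["missing"] }
  '
}

# Hypothetical usage:
# samtools view Reads.bam | xs_tally
```

If most spliced reads land in `missing`, IGV has to fall back on the read strand, which would be consistent with the symmetric junctions described in the question.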


note also that some aligners use the TS tag instead of XS (TS being the newer, official way of indicating strandedness, since XS was not a formalized tag). minimap2 also outputs a lowercase ts tag, which actually has a different meaning than TS: it is flipped, but that is just extra trivia.
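
A rough sketch of how the three tags could be reconciled into one genomic transcript strand per read. It rests on an assumption worth checking against the minimap2 documentation: that lowercase `ts` is reported relative to the read, so it is flipped here for reads aligned to the reverse strand. The function name `transcript_strand` is hypothetical.

```shell
# Print "read_name <TAB> transcript_strand" for SAM records on stdin.
# XS:A and TS:A are taken as-is; lowercase ts:A is flipped for reverse-strand
# reads (assumption: minimap2 reports ts relative to the read orientation).
transcript_strand() {
  awk -F'\t' '
    /^@/ { next }                      # skip header lines if present
    {
      strand = "."                     # "." means no strand tag was found
      rev = int($2 / 16) % 2           # FLAG bit 0x10: read on reverse strand
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^(XS|TS):A:[+-]$/) {
          strand = substr($i, 6, 1)
        } else if ($i ~ /^ts:A:[+-]$/) {
          s = substr($i, 6, 1)
          if (rev) s = (s == "+") ? "-" : "+"
          strand = s
        }
      }
      print $1 "\t" strand
    }
  '
}

# Hypothetical usage:
# samtools view Reads.bam | transcript_strand | cut -f2 | sort | uniq -c
```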


also note: you claim your reads are "stranded" but this may not be the case, because stranded protocols would probably not appear symmetric like this (the reads would truly indicate the strand of the transcript being sequenced)

