2.9 years ago
Apex92
Hi, I have the same problem as described in the earlier post "Interpreting splice junctions on IGV". Could anyone please elaborate on this? The IGV guide says: "When available, IGV uses the 'XS' tag provided by the alignment to determine strandedness. If missing, strandedness is inferred from the read strand."
My reads are single-end and stranded (single-cell data).
Any help is highly appreciated.
Thank you.
what are the stranded counts (see here) for any particular gene?
I have a merged BAM file that I converted to SAM, and there are 10145130 lines (reads) in the file. Using the method mentioned in the link you shared, I get 5071949 reads for forwardStrandReads.sam and 5072949 reads for reverseStrandReads.sam, which almost adds up to the total number of reads in the main SAM file. With that, do you think the reads are unstranded? I just checked the featureCounts command I used for counting: I had used -s 0 (meaning unstranded), and the "Successfully assigned alignments" percentage is 95%.
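you need to look at one gene - any gene - to see if it's stranded or unstranded. A roughly 50-50 split across the whole file does not tell you much by itself, since genes sit on both strands of the genome.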
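For what it's worth, here is a minimal sketch of that per-gene check in plain Python, reading SAM text lines and counting forward versus reverse reads whose start position falls inside one gene. The function name, the coordinates, and the "start position only" overlap test are my own simplifications (it ignores the CIGAR string), so treat it as an illustration rather than a proper tool.

```python
from collections import Counter

def count_strands_in_region(sam_lines, chrom, start, end):
    """Count forward/reverse mapped reads whose POS lies in [start, end] on chrom.

    Hypothetical helper: uses only the leftmost mapped position (SAM field 4,
    1-based) and the FLAG field, so reads that merely overlap the region from
    the left are missed.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:                  # read is unmapped
            continue
        if fields[2] != chrom:
            continue
        pos = int(fields[3])
        if not (start <= pos <= end):
            continue
        # FLAG bit 0x10 set means the read aligned to the reverse strand
        counts["reverse" if flag & 0x10 else "forward"] += 1
    return counts
```

For a gene on the + strand, a stranded library should give counts heavily skewed to one side (which side depends on the protocol), while an unstranded library gives something close to 50-50. You would call it with something like `count_strands_in_region(open("merged.sam"), "chr1", 11869, 14409)`, where the file name and coordinates are placeholders.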
Thank you Jeremy for giving input - I looked at the bam files, my reads are unstranded (I thought they were stranded).
the XS tag is added by the aligner, and indeed, if it is missing from unstranded data, then you probably have about a 50-50 split of positive- and negative-strand reads spanning a particular splice site, which makes the junction track look symmetric. The XS tag infers the strand like this: there is a splice site here, and no matter which strand the read itself aligned to, the aligner looks at the genome where the read aligns. If the canonical splice site dinucleotides (GT/AG) are on the + strand, the aligner says this read contributes to splicing of a gene on the + strand. The same logic applies to the negative strand.
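To make that logic concrete, here is a toy sketch in Python. It only knows the canonical GT/AG motif and its reverse complement (CT/AC as read on the + strand); real aligners handle more motifs and use annotation, so this is an illustration of the idea, not what any particular aligner does. The function name and the 0-based half-open coordinates are my own choices.

```python
def infer_transcript_strand(genome_seq, intron_start, intron_end):
    """Guess the transcript strand from the intron's boundary dinucleotides.

    genome_seq   : + strand sequence of the reference (toy string here)
    intron_start : 0-based index of the first intron base
    intron_end   : 0-based index one past the last intron base
    Returns "+", "-", or None when the motif is not canonical.
    """
    intron = genome_seq[intron_start:intron_end].upper()
    if len(intron) < 4:
        return None
    first_two, last_two = intron[:2], intron[-2:]
    if (first_two, last_two) == ("GT", "AG"):
        return "+"      # GT...AG read directly on the + strand
    if (first_two, last_two) == ("CT", "AC"):
        return "-"      # reverse complement of GT...AG, so the gene is on -
    return None         # non-canonical: this sketch cannot decide
```

Note that the read's own alignment strand never enters the function, which is the whole point: the call depends only on the genome sequence at the junction.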
note also that some aligners use the TS tag instead of XS (TS being the newer, official way of indicating strandedness, since XS was never a formalized tag). minimap2 also outputs a lower-case ts tag, which actually has a different meaning than TS: it is flipped. But that is just extra trivia.
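If you ever need to reconcile the two, here is a small sketch of the conversion as I understand it: it assumes minimap2's lower-case ts is expressed relative to the read's alignment orientation, whereas TS/XS are relative to the genome, so the value is flipped only for reverse-strand reads. That assumption and the function name are mine, so check the minimap2 documentation before relying on it.

```python
def minimap2_ts_to_genomic_strand(ts_value, flag):
    """Convert a minimap2 'ts' value into a genome-relative strand.

    Assumption (not verified here): 'ts' is relative to the read orientation,
    so it must be flipped when SAM FLAG bit 0x10 (reverse strand) is set.
    """
    if ts_value not in ("+", "-"):
        raise ValueError("ts must be '+' or '-'")
    read_is_reverse = bool(flag & 0x10)
    if not read_is_reverse:
        return ts_value
    return "-" if ts_value == "+" else "+"
```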
also note: you claim your reads are "stranded" but this may not be the case, because stranded protocols would probably not appear symmetric like this (the reads would truly indicate the strand of the transcript being sequenced)