Hi,
During my analysis of RNA-seq data for alternative splicing and splicing patterns, I came across some standard options of the STAR algorithm that I could not quite follow.
--alignSJoverhangMin 8
(minimum overhang for unannotated junctions)
--alignSJDBoverhangMin 1
(minimum overhang for annotated junctions)
I have concerns about using these standard options, which regulate the minimal overlap of a read across an exon junction. I don't understand why a one-nucleotide overlap is enough to map a read to an annotated exon junction. And why would you not use the same minimum overlap for both kinds of junction to begin with?
So my question is: what would you use as a minimal overlap? Would you use the standard settings or, for example, 6 nt for both kinds of junction?
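To make "overhang" concrete, here is a rough sketch of how one could measure it from an alignment's CIGAR string. This is only an illustration, not STAR's internal code: the function names `junction_overhangs` and `passes_min_overhang` are made up, and the block length is simplified to the count of M/=/X bases on either side of each N (intron) operation.

```python
import re

# One CIGAR operation: a length followed by an op code (SAM spec op set).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junction_overhangs(cigar):
    """Return (left, right) aligned-base counts flanking each N in a CIGAR.

    Simplification: only M/=/X bases count towards a block; insertions,
    deletions and clipping are ignored.
    """
    blocks = [0]  # aligned bases per exon block, split at each N
    for length, op in CIGAR_OP.findall(cigar):
        if op == "N":
            blocks.append(0)  # a junction starts a new block
        elif op in "M=X":
            blocks[-1] += int(length)
    return [(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]


def passes_min_overhang(cigar, min_overhang):
    """True if every junction has at least min_overhang bases on both sides."""
    return all(min(left, right) >= min_overhang
               for left, right in junction_overhangs(cigar))


# Hypothetical 50 nt read with a single base past the junction.
print(junction_overhangs("49M1000N1M"))
print(passes_min_overhang("49M1000N1M", 1))
print(passes_min_overhang("49M1000N1M", 8))
```

Under this simplification, the hypothetical read `49M1000N1M` would be accepted with a minimum overhang of 1 (the annotated-junction setting above) and rejected with a minimum overhang of 8 (the unannotated-junction setting).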
OK, I will continue using the default settings then.
However, I still can't wrap my head around why lowering the threshold for annotated junctions does not lead to mismapped reads at these positions. How is this possible? Or does STAR prevent mismapping at annotated junctions at another level?
I'm not sure why you'd think such mappings would be wrong. They're incredibly likely to be correct for the simple reason that the junction is annotated.
Wouldn't the probability of a wrong mapping be 3/4 when a read is mapped with just a one-nucleotide overlap?
EDIT: Thanks for editing btw, this forum is great! :)
You a priori expect reads spanning that junction, so no, the probability would be reasonably low.
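The prior argument can be put into numbers with a toy Bayes calculation. This is a back-of-the-envelope sketch, not how STAR scores alignments. It assumes uniform base composition (a random base matches with probability 1/4), no sequencing errors, and a prior probability chosen by hand; the function name `posterior_spliced` and all numbers are hypothetical.

```python
def posterior_spliced(prior_spliced, overhang, p_base_match=0.25):
    """Posterior probability that a read is truly spliced, given that its
    overhang bases match the downstream exon.

    Toy model: a truly spliced read always matches the exon, while an
    unspliced read matches it only by chance, with probability
    p_base_match per overhang base.
    """
    chance_match = p_base_match ** overhang
    evidence = prior_spliced + (1.0 - prior_spliced) * chance_match
    return prior_spliced / evidence


# Even with no prior preference (0.5), one matching base shifts the odds.
print(posterior_spliced(0.5, 1))
# Annotated junction: assume (hypothetically) 95% of such reads are spliced.
print(posterior_spliced(0.95, 1))
# A longer overhang makes a chance match far less likely.
print(posterior_spliced(0.5, 8))
```

In this toy model a 1 nt overhang gives a posterior of 0.8 with a flat prior of 0.5, and about 0.987 with a prior of 0.95. That is the sense in which expecting spliced reads at an annotated junction keeps the error probability low.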