I was wondering how to compare my mapping results between STAR and BWA. Can you suggest what kind of graphs (violin plots, box plots, dot plots) would best visualize the reads from the two mappers across all samples? Or a statistical test? I have a table like this with the mapping results:
Uniquely mapped reads (BWA) | % uniquely mapped (BWA) | Uniquely mapped reads (STAR) | % uniquely mapped (STAR)
41,768,567.00 | 56.36% | 45,568,519.00 | 61.49%
32,192,083.00 | 51.02% | 35,163,750.00 | 55.73%
34,663,350.00 | 56.66% | 39,451,894.00 | 64.49%
37,152,300.00 | 58.48% | 41,904,245.00 | 65.96%
35,571,069.00 | 53.57% | 39,641,131.00 | 59.71%
36,190,886.00 | 52.80% | 40,544,827.00 | 59.15%
30,627,147.00 | 63.97% | 35,736,940.00 | 74.64%
27,521,759.00 | 64.96% | 32,435,090.00 | 76.56%
Newer versions of bwa are able to align across splices, but probably not as well as aligners designed from the beginning for that purpose.
Aren't you confusing this with minimap2? I find no reference to that in either the bwa-mem or bwa-mem2 GitHub repositories. Can you link one?
Consider that bwa mem will identify nonlinear chimeric alignments (when it marks them as supplementary) and that is a more challenging task than aligning across splices.
Conceptually, spliced alignments are chimeric alignments that connect linearly.
I don't use bwa for RNA-Seq but just as an exploration, I once checked to see what bwa-mem does for RNA-Seq on a mouse genome. I was surprised to see that, based on visual inspection alone, the results seemed nearly identical.
The OP noted a lower mapping rate (though we don't know what kind of transcriptome they have). It is likely that bwa misses certain kinds of splices (perhaps short splices near the read ends), where bwa might rather soft clip than extend.
Many spliced aligners treat splice sites differently from other DNA and use that to assist the alignment process. BWA does not. That is one more reason spliced aligners could perform better.
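One way to check the soft-clipping idea above on your own data is to tally, per aligner, how many mapped records carry a splice gap (CIGAR `N`), a soft clip (CIGAR `S`), or the supplementary flag (0x800). Below is a minimal sketch that works on plain SAM text (for example the output of `samtools view`). The function names and the example records are made up for illustration, and it only looks at the FLAG and CIGAR columns.

```python
import re

# CIGAR operations as defined in the SAM format: length followed by op code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
UNMAPPED = 0x4          # SAM flag bit: read is unmapped
SUPPLEMENTARY = 0x800   # SAM flag bit: supplementary (split/chimeric) alignment


def classify_sam_line(line):
    """Return which features a single SAM alignment record shows."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    ops = {op for _, op in CIGAR_OP.findall(fields[5])}
    return {
        "mapped": not flag & UNMAPPED,
        "spliced": "N" in ops,        # reference skip, i.e. an intron-like gap
        "soft_clipped": "S" in ops,   # read end left unaligned
        "supplementary": bool(flag & SUPPLEMENTARY),
    }


def summarize(lines):
    """Count spliced, soft-clipped and supplementary records among mapped ones."""
    counts = {"total": 0, "spliced": 0, "soft_clipped": 0, "supplementary": 0}
    for line in lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        info = classify_sam_line(line)
        if not info["mapped"]:
            continue
        counts["total"] += 1
        for key in ("spliced", "soft_clipped", "supplementary"):
            counts[key] += info[key]
    return counts


# Hypothetical records, only to show the input shape (not real data).
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t50M1000N50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t100\t60\t50M50S\t*\t0\t0\t*\t*",
    "r2\t2048\tchr1\t1150\t60\t50H50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
example_counts = summarize(example)
print(example_counts)
```

Running the same tally on the STAR and the bwa output for one sample would show whether the reads that bwa fails to map uniquely tend to end up soft clipped or split into supplementary alignments.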
Ok, so in a nutshell this is super hacky and hence not recommended (just for the general audience or inexperienced users) => use STAR.
While I do agree that better tools exist for RNA-Seq and that people should use the right tool for the job, I don't think this feature should be called "super hacky". Of course, it depends on what you refer to as being hacky. I just want to clarify that there is nothing hacky about the alignment or what bwa does: it attempts to reflect the reality of the alignment.
We should all be happy that bwa is able to align like this. There could be many situations where, if it did not, we would be misled and reach the wrong conclusion, for example when aligning across multiple consecutive short deletions relative to the reference. So I am very appreciative that bwa can do this, and I feel much better looking at alignments in general, knowing I am not being misled even accidentally, all the while recommending a better tool.
The OP should not use bwa for this purpose as more appropriate tools exist for RNA-Seq - especially when it comes to distributing reads on spliced transcripts.
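Coming back to the original question about a plot or a test: each sample was mapped with both tools, so the data are paired, and a paired comparison is a natural fit. Here is a minimal sketch using only the Python standard library. It takes the "% uniquely mapped" columns from the table in the question and runs an exact sign-flip permutation test on the per-sample differences. The choice of test and the function name `paired_sign_flip_test` are my own assumptions for illustration, not something suggested elsewhere in this thread; a paired t-test or Wilcoxon signed-rank test would be reasonable alternatives.

```python
from itertools import product
from statistics import mean

# Percent uniquely mapped reads per sample, copied from the table in the question.
bwa = [56.36, 51.02, 56.66, 58.48, 53.57, 52.80, 63.97, 64.96]
star = [61.49, 55.73, 64.49, 65.96, 59.71, 59.15, 74.64, 76.56]


def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis of no difference between the two mappers,
    the sign of each per-sample difference is arbitrary, so every
    combination of signs is enumerated (2**n of them).
    """
    diffs = [y - x for x, y in zip(a, b)]
    observed = mean(diffs)
    n_total = 0
    n_extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        permuted = mean([s * d for s, d in zip(signs, diffs)])
        n_total += 1
        # Small tolerance guards against floating-point ties.
        if abs(permuted) >= abs(observed) - 1e-12:
            n_extreme += 1
    return observed, n_extreme / n_total


mean_diff, p_value = paired_sign_flip_test(bwa, star)
print(f"mean difference (STAR - BWA): {mean_diff:.2f} percentage points")
print(f"two-sided permutation p-value: {p_value:.4f}")
```

STAR is higher than BWA in every one of the eight samples, so only the all-positive and all-negative sign patterns are as extreme as the observed one, which gives 2/256, the smallest p-value this test can produce with eight pairs. For the figure, a paired dot plot (one line per sample connecting its BWA and STAR percentage) shows this more directly than violin or box plots, which hide the pairing.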