Hello, I would like a clear and simple answer on the difference between splice-aware and splice-unaware aligners. There is a discussion that @dfernan started, but for me the definition is not very clear. Can you help me, please? :D
Splice-aware aligners 'split' an alignment only where the split is consistent with correct splice site usage (a donor and an acceptor site; the most typical pair is GT-AG).
Splice-unaware aligners can 'split' an alignment wherever they see fit (i.e. wherever it maximizes the alignment score), so the split does not necessarily respect correct donor-acceptor site usage.
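To make the GT-AG rule concrete, here is a minimal sketch of the check a splice-aware aligner conceptually applies to a candidate split. The function name `is_canonical_intron` and the toy sequence are made up for illustration; real aligners also score non-canonical sites and use annotation, so treat this as a simplification.

```python
def is_canonical_intron(genome: str, intron_start: int, intron_end: int) -> bool:
    """Return True if genome[intron_start:intron_end] starts with GT and ends with AG.

    Coordinates are 0-based, end-exclusive (hypothetical convention for this sketch).
    """
    intron = genome[intron_start:intron_end].upper()
    # An intron needs room for both dinucleotides: GT at the donor, AG at the acceptor.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Toy genome: exon, intron (GT...AG), exon.
toy_genome = "AAAC" + "GTAAGTTTTCAG" + "GGCC"

print(is_canonical_intron(toy_genome, 4, 16))  # gap placed exactly on the intron
print(is_canonical_intron(toy_genome, 3, 15))  # same-length gap shifted by one base
```

A splice-unaware aligner would accept either gap if both gave the same alignment score; only the first one respects the donor and acceptor sites.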
That might indeed be the case, though I think it is a side effect/consequence of the former rather than a 'goal'.
Most of the time (and this is a hands-on observation) both succeed in aligning roughly the same region. That is not surprising, since these are often substantial regions (a few hundred bp), so it is very unlikely that they would match equally well somewhere else. The main difference is at the detailed level, by which I mean the few nucleotides around the actual splice site. A splice-aware aligner will ensure the alignment reads aGT--tAG, while an unaware one might align aG--TAG.
The big difference is with the 'ungapped' aligners: they will indeed force the match as best as possible nearby, using a small-ish gap in the alignment.
Splice-aware aligners such as TopHat and STAR provide BAM output with splice junction information (the specific locations where introns were removed), whereas non-splice-aware aligners such as Bowtie2 do not.
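In SAM/BAM records this junction information shows up as `N` operations (skipped reference region) in the CIGAR string. As a rough sketch, assuming you already have the 1-based leftmost position and the CIGAR of an alignment, the intron coordinates can be recovered like this (the function name `junctions_from_cigar` is hypothetical, not part of any aligner or library):

```python
import re

# One CIGAR element: a length followed by a single operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_cigar(pos: int, cigar: str) -> list[tuple[int, int]]:
    """Return (intron_start, intron_end) pairs, 1-based inclusive, for each N op.

    pos is the 1-based leftmost reference position of the alignment (SAM POS).
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            # N marks a skipped stretch of the reference, i.e. an intron.
            junctions.append((ref, ref + n - 1))
        if op in "MDN=X":
            # Only these operations consume reference bases.
            ref += n
    return junctions


print(junctions_from_cigar(100, "50M200N50M"))  # spliced read
print(junctions_from_cigar(100, "100M"))        # unspliced read, no junctions
```

An alignment from a non-splice-aware aligner would simply never contain an `N` operation, so the list would always be empty.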
Splicing brings together the ends of exons that may be much farther apart in the genome. A short Illumina read can therefore span a splice junction, a sequence that does not exist as one contiguous stretch in the genome. An aligner has to break that read apart and locate the genomic positions where the two pieces of the read align. Splice-aware aligners (STAR, BBMap, etc.) are able to do this, as opposed to non-splice-aware aligners (e.g. Bowtie v1.x).
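The "break the read apart" step can be illustrated with a deliberately naive sketch. It uses exact string matching and takes the first hit only, which is nothing like the seed-and-extend strategies real aligners use; the function name `split_align`, the `min_anchor` parameter, and the toy sequences are all assumptions for illustration.

```python
def split_align(read: str, genome: str, min_anchor: int = 3):
    """Try to place a read as two exact pieces separated by a gap in the genome.

    Returns (left_pos, split, right_pos) with 0-based genome positions,
    or None if no such split exists. Toy logic: exact matches, first hit only.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_pos = genome.find(left)
        if left_pos == -1:
            continue
        # The right piece must start at least one base after the left piece ends.
        right_pos = genome.find(right, left_pos + len(left) + 1)
        if right_pos != -1:
            return left_pos, split, right_pos
    return None


exon1, intron, exon2 = "ACGTTGCA", "GTAAGTCCCTTTAG", "CTGATCGG"
toy_genome = exon1 + intron + exon2
# A read spanning the junction: end of exon1 joined directly to start of exon2.
junction_read = exon1[-5:] + exon2[:5]

print(junction_read in toy_genome)              # the read is not contiguous in the genome
print(split_align(junction_read, toy_genome))   # but it can be placed as two pieces
```

An ungapped aligner gets stuck at the first step: the junction read has no contiguous match, so it is either left unaligned or aligned with mismatches.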
In which situations (if any) is it preferable to use the splice-unaware versions?
Perhaps if you are working with small RNA data, where you generally want ungapped alignments. Some splice-aware aligners have settings that allow you to turn that feature off.