What are the benefits and disadvantages of allowing multi-hits in TopHat?
Since TopHat 2.0, the default mapping settings allow multi-mappers (multi-hits).
From the TopHat 2.0.0 documentation:
...
In addition to reporting the best (or primary) alignments (the original TopHat behavior), TopHat 2 can report the secondary alignments up to 20 (the default) paired or single alignments (see --report-secondary-alignments and -g/--max-multihits)
These hits are reported as separate alignment records, so samtools counts them as independent entities. As a result, you end up with more "reads" in the BAM file than the FASTQ file contained.
| fastq file | 73485586 | |
| accepted reads | 86649216 | 118% |
| mapped | 77309546 | 105% |
| paired | 70973292 | 97% |
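To make the difference between alignment records and actual reads concrete, here is a minimal sketch in Python. It assumes SAM-formatted text lines as input and relies on the standard SAM flag bits (0x4 unmapped, 0x100 secondary alignment, 0x40/0x80 first/second mate). The function name `summarize_sam` and the toy records are made up for illustration; they are not from TopHat or samtools.

```python
# SAM flag bits as defined in the SAM specification
UNMAPPED = 0x4
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100  # "not primary alignment"


def summarize_sam(lines):
    """Count mapped alignment records, secondary records, and distinct reads.

    Hypothetical helper: `lines` is any iterable of SAM text lines.
    """
    records = 0
    secondary = 0
    reads = set()
    for line in lines:
        # Skip blank lines and header lines (which start with '@')
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            continue
        records += 1
        if flag & SECONDARY:
            secondary += 1
        # Keep the two mates of a pair apart when counting distinct reads
        if flag & FIRST_IN_PAIR:
            mate = 1
        elif flag & SECOND_IN_PAIR:
            mate = 2
        else:
            mate = 0
        reads.add((name, mate))
    return {
        "records": records,
        "secondary": secondary,
        "distinct_reads": len(reads),
    }


# Toy data (invented): r2 maps to two locations, so it yields two records
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t50\t50M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tchr1\t200\t1\t50M\t*\t0\t0\t*\t*\tNH:i:2",
    "r2\t256\tchr2\t300\t1\t50M\t*\t0\t0\t*\t*\tNH:i:2",
]
summary = summarize_sam(example)
print(summary)
```

On real data the same function could be fed the text output of `samtools view -h` on the BAM file. Whether a given downstream tool counts records or distinct reads would still need to be checked per tool.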
Assuming that other downstream programs will also treat them as independent reads, especially when the BAM file is sorted by location, my questions are:
- Have you observed artifacts from these multi-mappers?
- When limiting this with '--max-multihits 1', are the reported locations picked at random?
- Or is it best to ignore multi-mappers altogether with '--max-multihits 0'?