Hello!
Something very strange happened. I mapped a long-read fastq file of high-quality isoforms to a reference genome and got more mapped reads in the resulting .sam file than there were reads in the initial fastq file. I used the parameters recommended by Cupcake: “-ax splice -t 30 -uf --secondary=no -C5” (https://github.com/Magdoll/cDNA_Cupcake/wiki/Cupcake:-supporting-scripts-for-Iso-Seq-after-clustering-step). The fastq file contained 44695 transcripts (which I counted with grep and wc, and this number looks reasonable), while the mapped .sam file contains 46920 transcripts. I triple-checked it.
Has anyone experienced something like this?
If reads map to multiple locations, you will get more 'mappings' in your SAM file than there were reads in your fastq file.
This of course only holds if you counted the mappings as such, without taking the input read IDs into account. If you reduce the read IDs to a unique set, the count can never be higher than the number of reads in the input fastq file.
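The difference between alignment records and unique read IDs can be checked directly. Below is a minimal Python sketch, assuming a standard tab-separated SAM file in which column 1 is the read ID and column 2 is the FLAG. The function name summarize_sam and the example records are made up for illustration. It also separates out supplementary records (FLAG bit 0x800), since --secondary=no in minimap2 only suppresses secondary alignments and chimeric reads can still produce extra lines; whether that explains the count in this particular case is an assumption.

```python
def summarize_sam(lines):
    """Count SAM alignment records by type and the unique mapped read IDs."""
    stats = {"records": 0, "unmapped": 0, "secondary": 0,
             "supplementary": 0, "primary": 0}
    mapped_ids = set()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines (@HD, @SQ, @PG, ...)
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        stats["records"] += 1
        if flag & 0x4:        # read is unmapped
            stats["unmapped"] += 1
        elif flag & 0x100:    # secondary alignment
            stats["secondary"] += 1
        elif flag & 0x800:    # supplementary (chimeric) alignment
            stats["supplementary"] += 1
        else:                 # primary alignment of a mapped read
            stats["primary"] += 1
            mapped_ids.add(qname)
    stats["unique_mapped_ids"] = len(mapped_ids)
    return stats


# Hypothetical records: read1 has a primary and a supplementary alignment,
# read2 maps once on the reverse strand, read3 is unmapped.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t1\t60\t5M5S\t*\t0\t0\tACGTACGTAC\t*",
    "read1\t2048\tchr1\t500\t60\t5H5M\t*\t0\t0\tCGTAC\t*",
    "read2\t16\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
]
stats = summarize_sam(example)
print(stats)
```

On a real file the same function can be called as summarize_sam(open("mapped.sam")), where mapped.sam is a placeholder file name. If unique_mapped_ids is no larger than the fastq read count while records is larger, the surplus comes from reads with more than one alignment line.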
This happened to me too! I hadn't specified that a read could map to only one location.