I am trying to run standard quantification with salmon, using its default quasi-mapping mode. I built the index on the latest GENCODE release (v29):
salmon index -t gencode.v29.transcripts.fa.gz -i gencode_transcriptome_index_salmon
Then I ran the quantification:
salmon quant -i gencode_transcriptome_index_salmon -l SF -r mate1.fq.gz -o quant1 --writeUnmappedNames --validateMappings
(In reality I have paired-end reads, but to keep the example simple I show it as stranded single-end.) In the output, more than 30% of the reads are unmapped. After a manual BLAST, it seems that the most abundant of them actually belong to ENST00000316193.12, a transcript that looks quite ordinary and is present in GENCODE. Across the 150 nt of the read there is just a single mismatch, with every other base matching perfectly, so I really don't understand why it is unmapped. Below is a piece of the actual source FASTQ file with two such "unmappable" reads, followed by their alignments from the manual BLAST:
@A00261:111:HFJ5KDSXX:3:1101:28700:5259 1:N:0:AACAACCA+GGTGCGAA
CCCCGAACCACTCAGGGTCCTGTGGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGCCGTCCAAACCGTTCCGTTATCCAGGCTTTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00261:111:HFJ5KDSXX:3:1102:25473:13667 1:N:0:AACAACCA+GGTGCGAA
GGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGCCGTCCAAACCGTTCCGTTATCCAGGCTTTTTGACCACGCTATGCTCCAAGCCC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFF:FFFFFFFFFFFF,FFFFFFFFFFFF
_
Query  1    CCCCGAACCACTCAGGGTCCTGTGGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGC  120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  108  CCCCGAACCACTCAGGGTCCTGTGGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGC  227
Query  121  CGTCCAAACCGTTCCGTTATCCAGGCTTTT  150
            ||||||||||||||| ||||||||||||||
Sbjct  228  CGTCCAAACCGTTCCCTTATCCAGGCTTTT  257
_
Query  1    GGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGCCGTCCAAACCGTTCCGTTATCCA  120
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Sbjct  131  GGACAGCTCACCTAGTGGCAATGGCTCCAGGCTCCCGGACGTCCCTGCTCCTGGCTTTTGCCCTGCTCTGCCTGCCCTGGCTTCAAGAGGCTGGTGCCGTCCAAACCGTTCCCTTATCCA  250
Query  121  GGCTTTTTGACCACGCTATGCTCCAAGCCC  150
            ||||||||||||||||||||||||||||||
Sbjct  251  GGCTTTTTGACCACGCTATGCTCCAAGCCC  280
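For reference, the single mismatch in the first read can be located with a minimal Python sketch like the one below. The helper name find_mismatches is hypothetical, it assumes an ungapped alignment, and the two strings are simply the last 30 nt block of the first alignment (Query 121-150 against Sbjct 228-257) copied from the BLAST output above.

```python
def find_mismatches(query, subject, query_start=1):
    """Return (query_position, query_base, subject_base) for each mismatch.

    Assumes an ungapped alignment, so both strings must have equal length.
    Positions are 1-based and offset by query_start, as in BLAST output.
    """
    if len(query) != len(subject):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return [
        (query_start + i, q, s)
        for i, (q, s) in enumerate(zip(query, subject))
        if q != s
    ]


# Last alignment block of the first read (Query 121-150 vs Sbjct 228-257).
query_block = "CGTCCAAACCGTTCCGTTATCCAGGCTTTT"
subject_block = "CGTCCAAACCGTTCCCTTATCCAGGCTTTT"

mismatches = find_mismatches(query_block, subject_block, query_start=121)
print(mismatches)
```

This reports one mismatch at read position 136 (G in the read, C in the transcript), so the first 135 nt of that read match the transcript exactly.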
I would be grateful if someone could explain what happens here and why such reads cannot be mapped.