I am trying to use TopHat and Cufflinks to analyze RNA-seq data, but something goes wrong when running TopHat. Running TopHat with and without the GTF produces different mapping results, i.e., the same read maps to different sites in the two BAM files.
Many reads have this problem, and the alignments produced with the GTF contain many mismatches, sometimes ~80 mismatched bases in a single 90 bp read.
Here is one read pair as mapped with the GTF:
FCC3378ACXX:4:1113:11999:94438# 99 scaffold_1 928636 50 90M = 928763 217 CGCGCGTCCACGAGGCCGTTGTGGTCGGCGTCCGCGCGCTCGAAGGCGCCGGCGGCCACGGCCGCGTTGGGGTTCGGGTTGGCCAGGGGC bbbceecegggggiihiifihfhehffhhhiggeebccaaaXaccccccccT_a[aacca_accccVaccac_Xaaaa`aaccb^^a]aa AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:90 YT:Z:UU XS:A:- NH:i:1
5083 FCC3378ACXX:4:1113:11999:94438# 147 scaffold_1 928763 50 90M = 928636 -217 GCAGCGGCTCGGGCGGGGACTGCGCCAGCGCCAGCGCGACGAGCAGCGAGAGCGAGAGCGCCGCTAGCTGCAGGAAGAGGCGCGGCATGG BBBBBBBBBBaa[R`_b^^cc]V^_V[___a_YXaaa[W^b__Z^a`baaac`bfedaeUdf`agdfdfeahe_gffeggge`JcaZ[__ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:90 YT:Z:UU XS:A:- NH:i:1
And here is the same read pair as mapped without the GTF:
FCC3378ACXX:4:1113:11999:94438# 99 scaffold_1 910475 50 90M = 910602 217 CGCGCGTCCACGAGGCCGTTGTGGTCGGCGTCCGCGCGCTCGAAGGCGCCGGCGGCCACGGCCGCGTTGGGGTTCGGGTTGGCCAGGGGC bbbceecegggggiihiifihfhehffhhhiggeebccaaaXaccccccccT_a[aacca_accccVaccac_Xaaaa`aaccb^^a]aa AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:90 YT:Z:UU NH:i:1
4818 FCC3378ACXX:4:1113:11999:94438# 147 scaffold_1 910602 50 90M = 910475 -217 GCAGCGGCTCGGGCGGGGACTGCGCCAGCGCCAGCGCGACGAGCAGCGAGAGCGAGAGCGCCGCTAGCTGCAGGAAGAGGCGCGGCATGG BBBBBBBBBBaa[R`_b^^cc]V^_V[___a_YXaaa[W^b__Z^a`baaac`bfedaeUdf`agdfdfeahe_gffeggge`JcaZ[__ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:90 YT:Z:UU NH:i:1
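For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two outputs could be checked. It assumes the alignments have been exported as tab-separated SAM text (for example with `samtools view`). The function names (`parse_placements`, `discordant_placements`, `count_mismatches`) are hypothetical helpers written for illustration, not part of TopHat. Both runs report NM:i:0 for the records above, so `count_mismatches` recounts mismatches directly against a reference slice that the caller supplies. It only handles ungapped alignments such as 90M.

```python
from typing import Dict, Iterable, List, Tuple

Placement = Tuple[str, int]  # (reference name, 1-based leftmost position)
ReadKey = Tuple[str, int]    # (read name, mate number: 0, 1 or 2)


def mate_number(flag: int) -> int:
    """Return 1 or 2 for paired reads (SAM flag bits 0x40 / 0x80), else 0."""
    if flag & 0x40:
        return 1
    if flag & 0x80:
        return 2
    return 0


def parse_placements(sam_lines: Iterable[str]) -> Dict[ReadKey, Placement]:
    """Map (read name, mate) to (RNAME, POS) for primary mapped alignments."""
    placements: Dict[ReadKey, Placement] = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        # 0x4 unmapped, 0x100 secondary, 0x800 supplementary
        if flag & 0x904:
            continue
        placements[(fields[0], mate_number(flag))] = (fields[2], int(fields[3]))
    return placements


def discordant_placements(
    with_gtf: Iterable[str], without_gtf: Iterable[str]
) -> List[Tuple[ReadKey, Placement, Placement]]:
    """List reads present in both runs whose placement differs between them."""
    a = parse_placements(with_gtf)
    b = parse_placements(without_gtf)
    shared = a.keys() & b.keys()
    return sorted((key, a[key], b[key]) for key in shared if a[key] != b[key])


def count_mismatches(read_seq: str, ref_seq: str) -> int:
    """Count differing bases for an ungapped alignment (e.g. CIGAR 90M).

    SAM stores SEQ in reference orientation, so no reverse complement is
    needed. ref_seq is the reference slice starting at POS, with the same
    length as the read; fetching it is left to the caller.
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("read and reference slice must have equal length")
    return sum(1 for r, g in zip(read_seq.upper(), ref_seq.upper()) if r != g)
```

Feeding the text output of both runs into `discordant_placements` would list every read placed differently, and comparing `count_mismatches` with the reported NM tag for a few of those reads should show which run's coordinates actually agree with the genome.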
The command line is:
tophat \
-G FM_Physo1_1.gtf \
--mate-inner-dist 20 \
--mate-std-dev 34 \
-p 11 \
-o ./tophat_out/R0h \
Psojae_genome \
R0h_1.fq \
R0h_2.fq
Could someone help me with this?
FYI, I've moved this to an answer, since it sounds like this solved the problem.
That's pretty wild, but it's certainly good to know!