Dear all,
I am very new to this field, and I have run into a problem that I could not find an answer to.
- I used IGV to view the reads mapped to the Arabidopsis genome, and I found this. Why are the reads mapped to the wrong position? Or is the reference genome the problem? Does this suggest that something went wrong in the mapping procedure? How can reads be mapped to a position with so many mismatched bases?
The following is the TopHat command I used:
tophat -p 12 -o 20170905_tophat_WT -G /usr1/thhuang/index/TAIR10_GFF3_genes_transposons.gff /usr1/thhuang/index/TAIR10_chr Trim_WT.fastq.gz
I hope to find a solution.
Thank you all in advance.
Looks like you mapped against a different genome than you're using in IGV.
While the command line and the image both say TAIR10, it is certainly possible that something is wrong as far as the genome build is concerned. It is surprising that IGV still allows one to select an incorrect genome.
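One rough way to test the "different genome" idea is to compare the sequence names and lengths recorded in the alignment header with those of the FASTA loaded in IGV. The sketch below assumes you have saved the header as text (for example from `samtools view -H` on TopHat's accepted_hits.bam) and have a `.fai` index made with `samtools faidx`. The function names and the toy values are made up for illustration. Identical names and lengths do not prove the sequences are identical, so treat a clean result as a hint only.

```python
def parse_sam_header(header_text):
    """Return {sequence name: length} from the @SQ lines of a SAM header."""
    seqs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        seqs[fields["SN"]] = int(fields["LN"])
    return seqs


def parse_fai(fai_text):
    """Return {sequence name: length} from the first two columns of a .fai index."""
    seqs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            seqs[name] = int(length)
    return seqs


def compare_references(aln_seqs, ref_seqs):
    """List the sequences whose name or length disagrees with the reference."""
    problems = []
    for name, length in aln_seqs.items():
        if name not in ref_seqs:
            problems.append(f"{name}: not found in reference")
        elif ref_seqs[name] != length:
            problems.append(
                f"{name}: length {length} in alignments vs {ref_seqs[name]} in reference"
            )
    return problems


# Toy data (hypothetical lengths, not real TAIR10 values).
header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:Chr1\tLN:1000\n@SQ\tSN:Chr2\tLN:2000\n"
fai = "Chr1\t1000\t6\t60\t61\nChr2\t1900\t1030\t60\t61\n"

for problem in compare_references(parse_sam_header(header), parse_fai(fai)):
    print(problem)
```

If this reports nothing for your real files, the mismatch is more likely in which genome IGV has loaded than in the mapping itself.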
This is because of an oversimplification in the SAM spec, where a mismatch is reported as an M in the CIGAR string, just like a match, so we can only work out the mismatches from the MD tag. But the MD tag is optional; hence, by default, the SAM spec does not require the aligner to report how many mismatches an alignment contains... let's just think about that for a second... If the MD tag does not match what IGV sees, then there is a discrepancy. Now it becomes a user interface issue: how many mismatches should be tolerated before an error is shown? If the check is too strict, it just brings about another set of problems.
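To make the MD tag point concrete, here is a minimal sketch of counting mismatches from an MD string. It assumes the MD value follows the layout in the SAM optional-fields spec (numbers for runs of matching bases, single letters for mismatched reference bases, `^` plus letters for deleted reference bases). The function name and the example string are made up for illustration.

```python
import re

# One token is a run of matches, a deletion (^ then bases), or one mismatched base.
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def summarize_md(md):
    """Return (matching bases, mismatches, deleted reference bases) for an MD string."""
    matches = mismatches = deleted = 0
    for run, deletion, base in MD_TOKEN.findall(md):
        if run:
            matches += int(run)
        elif deletion:
            deleted += len(deletion) - 1  # do not count the leading '^'
        else:
            mismatches += 1
    return matches, mismatches, deleted


# A CIGAR of 22M2D... would not reveal the single mismatch; the MD string does.
print(summarize_md("10A5^AC6"))
```

Comparing the mismatch count from MD with the number of coloured bases IGV draws for the same read is one way to see whether the viewer and the aligner are looking at the same reference.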