Question

Why do reads not match the genome? Too many mismatched bases against the reference

0

7.4 years ago

maple964 • 0

Dear all,

I am very new to this field. I ran into a problem that I could not find an answer to.

I used IGV to view the reads mapped to the Arabidopsis genome. However, I found this. Why were the reads mapped to the wrong position? Or is the reference genome the problem? Does it suggest that something went wrong in the mapping procedure? How can reads be mapped to a position where so many bases differ?

The following is the tophat command I used.

tophat -p 12 -o 20170905_tophat_WT -G /usr1/thhuang/index/TAIR10_GFF3_genes_transposons.gff /usr1/thhuang/index/TAIR10_chr Trim_WT.fastq.gz

I hope to find a solution.

Thank you all in advance.

RNA-Seq rna-seq IGV tophat • 2.3k views

ADD COMMENT • link 7.3 years ago by maple964 • 0

1

Looks like you mapped against a different genome than you're using in IGV.

ADD REPLY • link 7.4 years ago by Devon Ryan 105k

1

While the command line and the image both say TAIR10, there is certainly the possibility that something is wrong as far as the genome build is concerned.

Surprising that IGV still allows one to select an incorrect genome.

ADD REPLY • link 7.4 years ago by GenoMax 148k

1

This is because of an oversimplification in the SAM spec: in the CIGAR string a mismatch is reported as an M, just like a match, so we can only figure out the mismatches from the MD tag. But the MD tag is optional; hence, by default, the SAM spec does not require the aligner to report how many mismatches the alignment contains... let's just think about that for a second...

If the MD tag does not match what IGV sees, then there is a discrepancy. Now it becomes a user interface issue: how many mismatches should be tolerated before showing an error? If the check is too strict, it just brings about another set of problems.
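To make the MD tag point concrete, here is a minimal sketch in plain Python that counts the mismatches an aligner recorded in an MD value. The function names and the example SAM line are made up for illustration; it assumes the aligner actually wrote an MD:Z: field, which, as noted, is optional.

```python
import re

# One MD token is either a run of matching bases (digits), a deletion from
# the reference (^ followed by bases), or a single mismatched reference base.
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def md_from_sam_line(line):
    """Return the MD tag value from one SAM record, or None if it is absent."""
    fields = line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional fields start at column 12
        if field.startswith("MD:Z:"):
            return field[5:]
    return None


def count_md(md):
    """Return (matching_bases, mismatches, deleted_bases) described by an MD value."""
    matches = mismatches = deleted = 0
    for number, deletion, base in MD_TOKEN.findall(md):
        if number:
            matches += int(number)
        elif deletion:
            deleted += len(deletion) - 1  # do not count the leading '^'
        else:
            mismatches += 1
    return matches, mismatches, deleted


# Hypothetical record: the CIGAR says 50M, yet the MD tag reveals 2 mismatches.
example = "read1\t0\tChr1\t100\t50\t50M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50 + "\tNM:i:2\tMD:Z:0T0G48"
print(count_md(md_from_sam_line(example)))
```

If the counts derived from MD are low but IGV colors most bases as mismatches, the alignments and the displayed reference probably do not belong together.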

ADD REPLY • link 7.4 years ago by Istvan Albert 102k

score 1 · Answer 1 · 2017-09-05

1

7.4 years ago

Istvan Albert 102k

You are most likely visualizing the alignments against a different genomic build than what the aligner used.

Ensure that the reference genomes are the same in both cases.
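One rough way to check this is to compare the sequence names and lengths in the BAM header (the @SQ lines printed by `samtools view -H`) with the `.fai` index that `samtools faidx` writes for the FASTA you load in IGV. The sketch below is only an illustration: the function names and the example values are hypothetical, and matching names and lengths do not prove that the sequences are identical, though a mismatch is a clear sign of a different build.

```python
def parse_sq_header(header_text):
    """Map sequence name -> length from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def parse_fai(fai_text):
    """Map sequence name -> length from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_references(bam_contigs, fasta_contigs):
    """List the sequences that are missing from the FASTA or differ in length."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in fasta_contigs:
            problems.append(f"{name}: missing from FASTA index")
        elif fasta_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM vs {fasta_contigs[name]} in FASTA"
            )
    return problems


# Made-up values for illustration only, not real TAIR10 lengths.
header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:Chr1\tLN:1000\n@SQ\tSN:Chr2\tLN:2000\n"
fai = "Chr1\t1000\t6\t60\t61\n2\t2000\t1030\t60\t61\n"
for problem in compare_references(parse_sq_header(header), parse_fai(fai)):
    print(problem)
```

In practice you would feed in the real header text and the contents of the real `.fai` file instead of the strings above.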

ADD COMMENT • link 7.4 years ago by Istvan Albert 102k

score 0 · Answer 2 · 2017-09-12

0

7.3 years ago

maple964 • 0

Thank you all for the replies! I appreciate it. I downloaded a new reference and ran the mapping with tophat again. The results now look normal.

ADD COMMENT • link 7.3 years ago by maple964 • 0