DNA Reads Mapped To Reference Are Skipping Introns
12.6 years ago
Vikas Bansal ★ 2.4k

Hi,

I will describe an example from my sequencing data to make this easy to understand. Say we captured some exons and sequenced them with 454, so the reads are long (~300 bp). After mapping, I see that for some reads, one half maps to the end of exon 1 of a gene and the other half maps to the beginning of exon 2 of the same gene. I have a few possible explanations in mind:

  1. Maybe there are intronic deletions in these regions.
  2. There is mRNA contamination (I want to know if this is possible).
  3. There is a retrocopy inserted in the genome of the individual whose exons we have sequenced.

Are there any other explanations for this scenario, and how can I determine which case is true?

mapping intron mrna • 2.7k views

The Broad found this as well; in the end they excluded case 2 and believed it was your case 1. I forget how, or whether, they excluded case 3, which seems to me the most likely.


Thanks. Since you said that case 3 seems the most likely, can you please suggest how I should decide between case 1 and case 3 (with a bioinformatic approach :))?


I guess easy things to check include: a) whether there are reads containing the intron; b) if there are copy number changes, though this is hard for exome sequencing.
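
A rough sketch of check (a), assuming you can pull each read's start position and CIGAR string out of the alignment file and that the aligner reports the jump as one large `D` or `N` gap in a single CIGAR (aligners that emit two separate partial alignments would need merging first). The function names, the 0-based half-open coordinates, and the `min_gap` and `slack` thresholds are all illustrative choices, not anything from a particular tool.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_blocks(pos, cigar, min_gap=50):
    """Return reference blocks (0-based, half-open) covered by one read.

    Gaps (D or N) of at least min_gap bases split the read into separate
    blocks; shorter deletions stay inside the current block.
    """
    blocks = []
    start = end = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            end += length
        elif op in "DN":
            if length >= min_gap:
                if end > start:
                    blocks.append((start, end))
                start = end = end + length
            else:
                end += length
        # I, S, H and P do not consume reference bases
    if end > start:
        blocks.append((start, end))
    return blocks

def classify(pos, cigar, intron_start, intron_end, slack=5):
    """Label a read relative to one intron [intron_start, intron_end)."""
    blocks = aligned_blocks(pos, cigar)
    # Aligned bases inside the intron: the read contains intronic sequence.
    for s, e in blocks:
        if min(e, intron_end) - max(s, intron_start) > slack:
            return "contains_intron"
    # Consecutive blocks that end and resume at the intron boundaries.
    for (_, e1), (s2, _) in zip(blocks, blocks[1:]):
        if abs(e1 - intron_start) <= slack and abs(s2 - intron_end) <= slack:
            return "exon_exon"
    return "other"
```

If every read at the locus comes back `exon_exon` and none `contains_intron`, the intron-containing allele is not being observed at all; a mix of both labels suggests that an intron-containing copy and an intron-less copy coexist.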


Just so that you have all cases at hand, how about a mapping error due to a repetitive region? I have not worked with exome sequencing data or with 454. However, even though the reads are long here (in comparison to RNA-Seq), is it possible that the region where you find the spliced reads is repetitive? Do you see some reads mapping across the exon-intron junction and others mapping across the exon-exon junction, or are the reads always spliced?

12.6 years ago

In response to your comments about how to disambiguate between 1 and 3, here's me shooting from the hip (with the disclaimer that I have really no experience doing CNV detection).

It seems to me that something about the proportions of intronic vs. junction reads would be informative, no? For instance, if it was a "hemizygous loss" of the intron, you'd have 1/2 coverage of the intron vs. the surrounding exons, no?
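
To make the proportion idea concrete, here is a minimal sketch, assuming you already have per-base depth values for the intron and for the flanking exons (for example from a depth table). The function names and the `tol` tolerance are made up for illustration, and the comparison only means something where the intron is actually covered, which capture data may not provide.

```python
def intron_exon_ratio(intron_depths, exon_depths):
    """Mean intron depth divided by mean flanking-exon depth."""
    if not intron_depths or not exon_depths:
        raise ValueError("need at least one depth value for each region")
    exon_mean = sum(exon_depths) / len(exon_depths)
    if exon_mean == 0:
        raise ValueError("flanking exons have zero coverage")
    intron_mean = sum(intron_depths) / len(intron_depths)
    return intron_mean / exon_mean

def interpret(ratio, tol=0.2):
    """Very rough reading of the ratio; tol is an arbitrary tolerance."""
    if abs(ratio - 0.5) <= tol:
        return "consistent with hemizygous loss"
    if abs(ratio - 1.0) <= tol:
        return "no evidence of loss"
    return "ambiguous"
```

With noisy targeted coverage the ratio will rarely sit cleanly at 0.5 or 1, so treat the label as a hint rather than a call.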

If it was a retrocopy, the exon-only coverage you get would look like a gene duplication, wouldn't it? (Not sure how reliably you can call duplication events from exome capture?). Furthermore, you'd have exon-exon junction reads for the full length of the transcript, no?
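
The junction idea could be checked along these lines: a sketch assuming you have the transcript's exon coordinates from an annotation and a list of gaps observed in split reads. `junction_support`, its coordinate convention (0-based, half-open), and the `slack` value are hypothetical.

```python
def junction_support(exons, observed_gaps, slack=5):
    """Report which annotated junctions of a transcript have split-read support.

    exons: list of (start, end) tuples sorted by position.
    observed_gaps: list of (gap_start, gap_end) tuples seen in split reads.
    Returns the expected junctions and a parallel list of booleans.
    """
    expected = [
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(exons, exons[1:])
    ]
    supported = [
        any(
            abs(gap_start - exp_start) <= slack and abs(gap_end - exp_end) <= slack
            for gap_start, gap_end in observed_gaps
        )
        for exp_start, exp_end in expected
    ]
    return expected, supported

exons = [(100, 200), (500, 600), (900, 1000)]
expected, supported = junction_support(exons, [(200, 500), (601, 899)])
print(sum(supported), "of", len(expected), "junctions supported")
```

Support at every junction leans towards a retrocopy, while support at a single junction leans towards an intronic deletion; partial support alone does not rule out a truncated retrocopy.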

Just brainstorming.


Thanks for your reply. I was thinking the same thing, but coverage goes up and down a lot in targeted sequencing, and moreover we have not sequenced the introns, so coverage proportions are hard to rely on. As for the retrocopy you mentioned, I think it is not always the case that the whole transcript will be inserted. (Please correct me if I am wrong.)


In biology, you'll never be wrong if you say "it's not always the case that ..." ;-)
