Interpret gene body coverage plot
5.1 years ago
wax4001 ▴ 10

I am wondering what could cause the jump in the plot?

The plot was generated using Qualimap. The sample with the jump also shows weird expression data, which I am trying to understand and maybe troubleshoot. I am beginning to believe that the sample has DNA contamination, but I am wondering how to explain this jump.

[Image: Qualimap gene body coverage plot]

RNA-Seq geneBody_coverage QC • 3.5k views
5.1 years ago
Jianyu ▴ 580

Usually, this kind of jump is due to an over-represented sequence. In your case, one or two over-represented sequences located at around 60-70% of the transcript length may have caused this jump. If you have some programming skills, you can plot a heatmap of the coverage along each transcript to find out which transcript contains this kind of sequence.
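As a rough sketch of that per-transcript check: the code below assumes you have already extracted per-base coverage for each transcript as a list of numbers in 5' to 3' order. The function names `percentile_profile` and `top_contributors` and the toy data are hypothetical, and this is not necessarily how Qualimap builds its plot. It rescales every transcript to 100 bins and ranks transcripts by how much of the summed signal they contribute in the 60-70% region.

```python
def percentile_profile(coverage, n_bins=100):
    """Rescale one transcript's per-base coverage (5'->3') to n_bins mean values."""
    length = len(coverage)
    profile = []
    for b in range(n_bins):
        start = b * length // n_bins
        # Guarantee at least one base per bin for transcripts shorter than n_bins.
        end = max((b + 1) * length // n_bins, start + 1)
        window = coverage[start:end]
        profile.append(sum(window) / len(window))
    return profile


def top_contributors(profiles, lo_bin, hi_bin, k=5):
    """Rank transcripts by their share of the summed signal in bins lo_bin..hi_bin-1."""
    totals = {name: sum(p[lo_bin:hi_bin]) for name, p in profiles.items()}
    grand = sum(totals.values()) or 1.0
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, total / grand) for name, total in ranked[:k]]


# Hypothetical toy data: two evenly covered transcripts and one short
# transcript with a 10 bp pile-up at 65-70% of its length.
toy = {
    "tx_flat_a": [10] * 1000,
    "tx_flat_b": [20] * 500,
    "tx_spike": [5] * 130 + [5000] * 10 + [5] * 60,
}
profiles = {name: percentile_profile(cov) for name, cov in toy.items()}
print(top_contributors(profiles, 60, 70))
```

Each entry of `profiles` can also be drawn as one row of a heatmap with whatever plotting library you prefer; a single bright stripe in one row would point to the transcript responsible.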

I do find some over-represented sequences. However, this is a gene body coverage plot. If a transcript is over-represented, why would it cause a jump at one position, and one that is only about 10 bp wide?

If an over-represented sequence is located in the middle of a transcript, then you will see a jump in the middle.

Transcript lengths vary widely, and some transcripts are shorter than 100 bp. A 10 bp over-represented sequence therefore occupies 10% or more of such a transcript.
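To make the arithmetic concrete, here is a small sketch with made-up numbers. It assumes the plot averages per-transcript profiles that were each rescaled to 100 bins, which may differ from what Qualimap actually does; `bins_spanned` is a hypothetical helper. It shows how many bins a 10 bp stretch covers for different transcript lengths, and how one short transcript with a pile-up can lift the average across 50 evenly covered transcripts.

```python
def bins_spanned(seq_len, tx_len, n_bins=100):
    """Approximate number of percentile bins covered by a seq_len-bp stretch."""
    return seq_len * n_bins / tx_len


# A 10 bp stretch on transcripts of different lengths.
for tx_len in (100, 500, 2000):
    print(tx_len, "bp transcript:", bins_spanned(10, tx_len), "bins")

n_bins = 100
# 50 evenly covered transcripts (made-up coverage of 10 in every bin).
flat = [[10.0] * n_bins for _ in range(50)]
# One 100 bp transcript with a 10 bp pile-up, i.e. 10 bins at 60-70%.
short = [5.0] * n_bins
for b in range(60, 70):
    short[b] = 5000.0

all_profiles = flat + [short]
mean_profile = [
    sum(p[b] for p in all_profiles) / len(all_profiles) for b in range(n_bins)
]
# Compare the average inside and outside the pile-up region.
print(round(mean_profile[50], 1), round(mean_profile[65], 1))
```

With these toy values, the averaged profile inside the 60-70% region is more than ten times higher than outside it, even though only one of the 51 transcripts is affected.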

So you are saying that only a few sequences from a few genes are over-represented in this case, and that this causes the overall jump? I have checked the alignment distribution across the genome, which looks normal. Do you think removing the over-represented sequences would be a way to correct the data?

One possibility is high expression of a small RNA that is embedded in a longer sequence. For example, we still see this sort of pattern caused by snoRNAs when we do iCLIP or NET-seq analysis.

Yes, I have seen this problem in some data, but there could be other causes that I am not aware of.

I don't know how you determined that the alignment distribution is normal. Whether to remove the over-represented sequences should be decided only after you know precisely why this jump happens.

I still recommend looking at the coverage per gene first, before making a decision.
