I was looking at the read counts from my single-cell sequencing data in IGV and noticed that very few reads map to the last few exons of Ezh2, while a lot of reads map to the first few exons (even after normalizing for exon length). I am using 3' end sequencing, so I was expecting a lot more reads on the last few exons, since that is the 3' end. Is there a reason why this is the case, or are there any papers that show how reads are distributed across exons?
The rightmost side is the 3' side (exon 20). There are a lot of reads at exon 20 (UTR), as expected. However, you can see that there are almost no reads near the last few exons, but a lot of reads in exons 5-12, which is the part that is unexpected to me. Is this some kind of artifact?
I mean, it's not 100% convincing, but a lot of your bigger peaks seem to be upstream of an AAAAA stretch, which might be a sign that they come from non-polyA-tail priming, i.e. the oligo-dT primer annealing to an internal A-rich stretch rather than to the polyA tail. Obviously, this is very speculative, but your coverage is highest at the 3' end, as expected. And at least 3 of the 5 tallest exon peaks in your data have a clear AAAAA motif within the exon itself (the first big exon to the left of the gene name [in your screenshot], and then 2 of the 3 exons furthest to the right).
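If you want to check this more systematically than by eye, here is a rough sketch of how you could scan exon sequences for A-runs and ask whether a peak sits just upstream of one. The function names, the 5-base minimum, and the window size are my own assumptions for illustration, not anything from your pipeline, and the sequence is assumed to be in transcript (sense) orientation, so reverse-complement first for a minus-strand gene.

```python
import re


def find_polya_stretches(seq, min_len=5):
    """Return (start, end) 0-based, half-open positions of runs of >= min_len A's.

    `seq` is assumed to be in transcript (sense) orientation.
    """
    pattern = re.compile(r"A{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


def peak_upstream_of_polya(peak_pos, stretches, window=300):
    """True if some A-run starts at or within `window` bases downstream of peak_pos.

    `window` is an assumed value; pick one that matches your fragment sizes.
    """
    return any(0 <= start - peak_pos <= window for start, _end in stretches)


# Toy exon sequence (made up, not real Ezh2 sequence).
exon = "GCTAGCTTAGGCAAAAAAGCTTACG"
stretches = find_polya_stretches(exon)

# A hypothetical peak at position 5 lies upstream of the A-run;
# one at position 20 lies downstream of it.
print(stretches)
print(peak_upstream_of_polya(5, stretches, window=10))
print(peak_upstream_of_polya(20, stretches, window=10))
```

If most of the tall internal peaks pass this check and the exons with few reads have no A-runs, that would support the internal-priming explanation, although it still would not prove it.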