Difference in the percentage of reads falling into intronic and coding regions
5.2 years ago
asmi.g • 0

Dear All,

I am analyzing RNA-Seq data from multiple tumor samples. Of the 20 samples in total, 10 were submitted in one batch and the other 10 were sent for sequencing later. I mapped the reads with the STAR aligner and used Picard tools to collect alignment and RNA-Seq metrics.

From the Picard RNA-Seq metrics, it looks like the two batches have different percentages of reads (or bases) assigned to intronic and mRNA regions (see the figure in the link; only a few samples are shown for comparison). The difference between the two batches looks quite substantial to me. My question is: can this be called a "batch" effect? And can it be removed with SVA or ComBat?
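To put numbers on the comparison described above, the per-sample Picard reports can be parsed and averaged per batch. The following is a minimal sketch, assuming the reports come from Picard CollectRnaSeqMetrics (one tab-separated header row and one value row after the `## METRICS CLASS` line). The sample names, batch labels, and the truncated `EXAMPLE_REPORT` text are hypothetical placeholders, not data from this thread.

```python
from statistics import mean


def read_rnaseq_metrics(text):
    """Return the metrics row of a CollectRnaSeqMetrics report as a dict of strings."""
    lines = text.splitlines()
    # The metrics table follows the "## METRICS CLASS" line:
    # one tab-separated header row, then one row of values.
    start = next(i for i, line in enumerate(lines)
                 if line.startswith("## METRICS CLASS"))
    header = lines[start + 1].split("\t")
    values = lines[start + 2].split("\t")
    return dict(zip(header, values))


def mean_by_batch(metrics_by_sample, batch_of, column):
    """Average one metrics column (e.g. PCT_INTRONIC_BASES) within each batch."""
    grouped = {}
    for sample, metrics in metrics_by_sample.items():
        grouped.setdefault(batch_of[sample], []).append(float(metrics[column]))
    return {batch: mean(values) for batch, values in grouped.items()}


# Hypothetical, heavily truncated report text (real reports have many more columns).
EXAMPLE_REPORT = (
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PF_BASES\tPCT_INTRONIC_BASES\tPCT_MRNA_BASES\n"
    "1000\t0.25\t0.70\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
)

if __name__ == "__main__":
    # In practice, read each report with open(path).read(); names here are made up.
    metrics = {"sample_01": read_rnaseq_metrics(EXAMPLE_REPORT)}
    batches = {"sample_01": "batch1"}
    print(mean_by_batch(metrics, batches, "PCT_INTRONIC_BASES"))
```

Comparing the per-batch means of `PCT_INTRONIC_BASES` and `PCT_MRNA_BASES` side by side gives a quick summary of the shift; it does not by itself say what caused it.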

Also, what could cause this difference? Will it significantly affect downstream transcript quantification with featureCounts or RSEM, or will their internal normalization schemes compensate for it?

P.S. I also noticed that the total number of input reads varies across these samples. But I think it is highly unlikely that the difference in library size is the source of this systematic error.
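The library-size point can be checked directly rather than assumed. Picard's `PCT_*` columns are fractions of aligned bases, so they are already scaled by depth; a simple test is to correlate the number of input reads with the intronic fraction within each batch. Below is a minimal sketch using a hand-written Pearson correlation; all numbers in it are invented for illustration and are not taken from the samples discussed here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical values for one batch: input reads (millions) and
    # PCT_INTRONIC_BASES per sample. Replace with the real metrics.
    input_reads_millions = [42.0, 55.5, 38.2, 61.0, 47.3]
    pct_intronic = [0.31, 0.29, 0.33, 0.30, 0.32]
    r = pearson(input_reads_millions, pct_intronic)
    print(f"correlation between depth and intronic fraction: {r:.2f}")
```

Running this separately per batch matters: if depth also differs between batches, pooling all 20 samples would mix the batch difference into the correlation.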

Thanks for your valuable comments!

image

RNA-Seq • 768 views

Edited to correct the image link.


Differences in library preparation that affect the fraction of DNA retained. Ideally there would be none. Were they both poly-A selected?

Difference in intron retention between the samples. I highly doubt this, but mention it as a possibility. Were all the 10 samples in batch 2 somehow consistently different from batch 1 in their disease state?


Point 1: Yes, the library was prepared by poly-A selection.

Point 2: All the samples were tumor samples. The only difference is that they were sent in different batches, because the pathological analysis of the remaining 10 samples was delayed (also, the company to which the sequencing was outsourced moved from city A to city B). So I assume the lab setup would have been different. But it is not clear why that would cause this difference.
