Does the read length of RNA-seq data affect the results?
3.9 years ago
chaudharyc61 ▴ 100

Hello everyone

As the title says: does the read length of RNA-seq data affect the results? I have wild-type samples sequenced as 75 bp paired-end and mutant samples sequenced as 150 bp paired-end.

After mapping, does this difference affect the DEGs?

Thank you, Chandan Kumar

RNA-Seq next-gen DESEq2 • 1.3k views
ADD COMMENT
0
Entering edit mode

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0697-y

As noted by @ATpoint, you should start from comparable read lengths within a single analysis.

ADD REPLY
1
Entering edit mode

@OP, this is actually a general principle. If you compare groups in a statistical framework, you must make sure the only difference between them is the biological effect you want to test. Anything else that is specific to one group is a confounder.

ADD REPLY
1
Entering edit mode
3.9 years ago
ATpoint 85k

I would anticipate that the impact would be minor on a global scale, but individual genes might be affected. Longer reads improve alignment: false alignments can be reduced because longer reads are more likely to map uniquely. To avoid mappability bias, I would probably trim all data to a constant length, for example with seqtk, and then remap.
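To make the trimming step concrete, here is a minimal Python sketch of what "trim to a constant length" means. It is not how seqtk works internally, just an illustration. It assumes plain four-line FASTQ records (no wrapped sequences), and the function name `trim_fastq` and the demo read are made up for this example.

```python
import io


def trim_fastq(src, dst, max_len=75):
    """Copy FASTQ records from src to dst, keeping at most max_len bases
    from the 5' end of each read. Assumes four lines per record."""
    n_records = 0
    while True:
        header = src.readline()
        if not header.strip():
            # End of file (or a trailing blank line): stop reading.
            break
        seq = src.readline().rstrip("\n")
        src.readline()  # the '+' separator line, rewritten below
        qual = src.readline().rstrip("\n")
        if len(seq) != len(qual):
            raise ValueError(f"length mismatch in record {header.strip()}")
        # Sequence and quality string are cut at the same position.
        dst.write(header.rstrip("\n") + "\n")
        dst.write(seq[:max_len] + "\n")
        dst.write("+\n")
        dst.write(qual[:max_len] + "\n")
        n_records += 1
    return n_records


# Hypothetical 12 bp read, trimmed to 5 bp for demonstration.
demo_in = io.StringIO("@read1\n" + "ACGT" * 3 + "\n+\n" + "I" * 12 + "\n")
demo_out = io.StringIO()
count = trim_fastq(demo_in, demo_out, max_len=5)
```

For paired-end data the same function would be run on both mate files with the same `max_len`. No records are dropped, so the mates stay in the same order. Reads already at or below `max_len` pass through unchanged.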

The fact that both groups differ in sequencing implies that they might have been produced at different timepoints. Is that the case? If so, the experiment would be confounded; hopefully the confounding effect does not mask any meaningful biological effects. Can you elaborate?

ADD COMMENT
0
Entering edit mode

Yes, both data groups were sequenced at different timepoints.

ADD REPLY
0
Entering edit mode
3.9 years ago
ponganta ▴ 590

What kind of analyses do you want to conduct? How do you quantify (mapping or quasi-mapping), and what kind of reference do you use? Do you want to compare WT and mutant under certain conditions?

To add to @ATpoint and @GenoMax: if you want to find DEGs between WT and mutant, you might see a pretty hefty batch effect. Make sure to investigate such effects via clustering and PCA of the samples before running any DGE analysis.

ADD COMMENT
1
Entering edit mode

The OP states that the read length is entirely confounded with biological condition. Thus, you won't be able to see this as a batch effect on a PCA.

ADD REPLY
0
Entering edit mode

Unfortunately, the OP also states that both libraries were constructed in different experiments, hence the likely batch effect I mentioned. Sorry for my imprecise wording! Maybe ComBat will be of use here? But to @chaudharyc61: I doubt that you can successfully conduct DGE analyses in this situation. Look out for batch effects using a PCA. If you find that PC1 explains most of the variation and clearly separates WT and mutant into two groups, this will be indicative of a batch effect due to different experiments (i.e. different libraries made by different people at different times with different technology) being compared.

ADD REPLY
1
Entering edit mode

If group is confounded by batch you cannot correct it. If groups separate then this can be due to biology or batch, or both. No way to tell.

ADD REPLY
0
Entering edit mode

I concur. When group is 100% confounded it is mathematically impossible to correct it, irrespective of how fancy the tool you use is.
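A small sketch of why this is so, written in plain Python with made-up sample labels: if every WT sample sits in one batch and every mutant sample in the other, the condition and batch columns of the design matrix are identical, so the matrix loses a rank and the two effects cannot be estimated separately. The helper names (`matrix_rank`, `design`) and the six-sample layouts are assumptions for illustration only.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column is linearly dependent on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design(condition, batch):
    """Design matrix with columns: intercept, condition, batch."""
    return [[1, c, b] for c, b in zip(condition, batch)]


# Confounded layout: all WT (0) in batch 0, all mutant (1) in batch 1.
confounded = design(condition=[0, 0, 0, 1, 1, 1], batch=[0, 0, 0, 1, 1, 1])
# Balanced layout: each batch contains both WT and mutant samples.
balanced = design(condition=[0, 0, 1, 1, 0, 1], batch=[0, 1, 0, 1, 0, 1])

rank_confounded = matrix_rank(confounded)  # 2: condition and batch are inseparable
rank_balanced = matrix_rank(balanced)      # 3: both effects can be estimated
```

With three columns but rank 2, any split of the observed difference between "condition" and "batch" fits the data equally well, which is why no correction tool can recover the biological effect from such a design.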

ADD REPLY
