6.6 years ago
star
I have RNA-seq samples that are SE, 50 bp, and I would like to do a DGE comparison against another set of RNA-seq data that is PE, 100 bp. Is there any noticeable problem with comparing SE data against PE data?
Were the different sample sets from different labs or even from the same lab but taken at different times/conditions? If so, I would be extremely wary of batch effects, which often have a very strong influence over RNA-seq data.
In terms of differences purely between 50 bp SE and 100 bp PE, the latter will give you a higher proportion of confidently and correctly mapped reads. This will be especially true for repetitive regions, which will be comparatively underrepresented or inaccurately quantified in the SE samples.
If you include the type of sequencing (e.g. SE or PE) as a single covariate in your design model, then you should not have major issues. Obviously, it is better to have a good experimental design from the very start.
As I understand it, the OP is trying to compare expression in one set of samples done with SE with that of another set done with PE, so adding SE/PE as a covariate will make no difference. If that's not the case then yes, you're correct.
You're correct - it is not 100% clear in the question.
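To illustrate the point about the covariate, here is a minimal sketch in Python/NumPy (not taken from any DGE package). The condition labels "A"/"B" and the helper names `design_matrix` and `is_estimable` are made up for illustration. It only checks whether an intercept + condition + SE/PE design matrix has full column rank, which is what is needed for the condition effect to be separable from the library type.

```python
import numpy as np

def design_matrix(condition, library_type):
    """Intercept + condition + library-type columns, 0/1 coded.

    'A'/'B' and 'SE'/'PE' are hypothetical labels used for illustration.
    """
    condition = np.asarray(condition)
    library_type = np.asarray(library_type)
    intercept = np.ones(len(condition))
    cond_col = (condition == "B").astype(float)      # 1 if condition B
    lib_col = (library_type == "PE").astype(float)   # 1 if paired-end
    return np.column_stack([intercept, cond_col, lib_col])

def is_estimable(X):
    """True if the design matrix has full column rank."""
    return bool(np.linalg.matrix_rank(X) == X.shape[1])

# Case 1: every condition-A sample is SE and every condition-B sample is PE.
confounded = design_matrix(
    ["A", "A", "A", "B", "B", "B"],
    ["SE", "SE", "SE", "PE", "PE", "PE"],
)

# Case 2: both library types occur within both conditions.
crossed = design_matrix(
    ["A", "A", "B", "B", "A", "B"],
    ["SE", "PE", "SE", "PE", "SE", "PE"],
)

print("confounded design estimable:", is_estimable(confounded))
print("crossed design estimable:", is_estimable(crossed))
```

In the first case the condition and SE/PE columns are identical, so no model can tell the two effects apart; in the second case the covariate can absorb the library-type effect.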
star, is the experimental setup the following:
...or is it:
?
@ Kevin Blighe: Thanks a lot. No, I would like to compare all SE versus all PE; the latter is from another study.
@ dr_bantz: Thanks a lot. These two sets of samples are from two different experiments: the SE data is from our lab and the PE data is from another study by someone else.