Hey everybody,
I am working with an RNA-seq dataset that combines samples from strand-specific and unstranded technology. Ideally I would like to use all the samples to be able to detect DE genes between biological groups.
I am trying to deal with the problem at the gene expression level by treating the difference in technologies as a batch effect. However, my dataset is not very balanced (the biological groups are not evenly distributed between the two technologies).
Has anyone dealt with a similar situation before? Is there a way to account for the difference in technologies in upstream stages (like counting)?
Thanks a lot,
Liron
Thank you very much! I will do that and see what happens! I appreciate your help, Liron
Hi again! I tried DESeq2 and I like it! My question is: how did you know that the batch effect was removed? I give the batch as a variable in the design to DESeq2 and I get a set of differentially expressed genes, but how do I know that the batch effect was removed? Is there a way to perform PCA or clustering on the data without the batch effect?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. These comments should have gone under @Kevin's answer.
Derive transformed expression levels via rlog(..., blind = FALSE) or vst(..., blind = FALSE), and then check the samples on a PCA bi-plot.
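In case it helps, here is a minimal sketch of that check. It runs on simulated counts from makeExampleDESeqDataSet() rather than real data, and the column names batch and condition are placeholders for whatever is in your own colData. The removeBatchEffect() step from limma is not part of the suggestion above; it is one common way (an assumption on my side that it suits this dataset) to get a PCA "without the batch" for plotting.

```r
# Sketch only: simulated counts stand in for the real dataset.
library(DESeq2)
library(limma)

set.seed(1)
# Simulated data (assumption): 5000 genes, 12 samples, conditions A and B.
dds <- makeExampleDESeqDataSet(n = 5000, m = 12)
# Hypothetical batch column: library type alternates across samples.
dds$batch <- factor(rep(c("stranded", "unstranded"), times = 6))
design(dds) <- ~ batch + condition

# Transform using the design (blind = FALSE), as suggested above.
vsd <- vst(dds, blind = FALSE)

# PCA before any correction: do samples separate by batch?
plotPCA(vsd, intgroup = c("batch", "condition"))

# For visualization only: regress the batch term out of the transformed
# values, protecting the condition effect through the design matrix.
vsd_corrected <- vsd
assay(vsd_corrected) <- removeBatchEffect(
  assay(vsd),
  batch  = vsd$batch,
  design = model.matrix(~ condition, data = as.data.frame(colData(vsd)))
)

# PCA after correction: batch should no longer drive the main components.
plotPCA(vsd_corrected, intgroup = c("batch", "condition"))
```

The corrected matrix is meant for PCA and clustering only. For the differential expression test itself, keep the raw counts and leave batch in the design formula, as you are already doing.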