17 days ago
Luka
Hi, I have used 4 databases in my research; they are all RNA-seq data for the same type of tumor. My dilemma is that I started from the raw counts in these databases, then used ComBat-seq and VST to remove the batch effect and normalize the data. Is this OK? I plotted the PCA before and after normalization. Thank you so much for your answers!
You should remove batch effects only when you have batch effects.
You should not remove batch effects if the data is different. In that case you might be removing the differences.
In general, I consider batch effect removal to be a dangerous task to embark on unless one fully understands what is going on.
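As a rough illustration of why this is risky, here is a minimal Python sketch that flags batches containing only one biological condition. In such a batch, the batch label and the condition cannot be told apart, so a correction may remove real differences. The sample labels (`db1`, `tumor`, and so on) and the helper `confounded_batches` are hypothetical and are not taken from the original post.

```python
from collections import defaultdict


def confounded_batches(samples):
    """Return the batches that contain only a single condition.

    `samples` is a list of (batch, condition) pairs. All labels used
    here are hypothetical placeholders.
    """
    conditions_per_batch = defaultdict(set)
    for batch, condition in samples:
        conditions_per_batch[batch].add(condition)
    # A batch with one condition gives no way to separate batch from biology.
    return sorted(
        batch
        for batch, conditions in conditions_per_batch.items()
        if len(conditions) == 1
    )


# Hypothetical sample sheet: db2 holds only tumor samples.
samples = [
    ("db1", "tumor"), ("db1", "normal"),
    ("db2", "tumor"), ("db2", "tumor"),
    ("db3", "normal"), ("db3", "tumor"),
]
flagged = confounded_batches(samples)
print(flagged)
```

A non-empty result does not prove that a correction is wrong, only that batch and condition deserve a closer look before anything is removed.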
Thank you, Istvan, for your reply. I would say that the batch effect is evident here, no? If not, could you please elaborate? I am new to this and still trying out the analysis. Thank you again for your reply!
What I am saying is that there is a difference between the original counts, and the samples separate on the PCA plot.
Whether that difference is solely due to batch effects is probably a lot more difficult to establish from the image. Using the same cell line is an insufficient common element to attribute all changes to batch effects.
But first and foremost, you should plot your data while including another sample that does indeed change and is different from the others.
That way you will be able to see the relative differences between replicates and across conditions.
The changes as plotted in the original image are uninformative because they are expressed as a percentage of the change, which does not indicate how big the change is.
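To make that last point concrete, here is a small Python sketch, assuming the percentages refer to the variance explained on the PCA axes. It runs a two-variable PCA by hand on two invented data sets, one a 100-fold shrunken copy of the other. Both report the same percentage for PC1 even though the absolute spread differs by a factor of 10,000. The numbers and the helper names `pca_2d` and `percent_pc1` are made up for illustration.

```python
import math


def pca_2d(points):
    """Eigenvalues of the 2x2 sample covariance matrix, largest first."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[var_x, cov_xy], [cov_xy, var_y]].
    mid = (var_x + var_y) / 2
    half_gap = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    return mid + half_gap, mid - half_gap


def percent_pc1(points):
    """Percentage of the total variance carried by the first component."""
    first, second = pca_2d(points)
    return 100 * first / (first + second)


# Invented data: `small` is `large` shrunk 100-fold.
large = [(0.0, 0.0), (10.0, 1.0), (20.0, -1.0), (30.0, 0.5)]
small = [(x / 100, y / 100) for x, y in large]

for name, data in (("large", large), ("small", small)):
    total_variance = sum(pca_2d(data))
    print(name, round(percent_pc1(data), 2), total_variance)
```

The percentage alone cannot distinguish the two cases, which is why plotting a sample that is known to differ gives the axes a scale to compare against.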