Hi everyone, I have downloaded the RNA-seq data for the PAAD cancer type using TCGA assembler. TCGA assembler also generates a boxplot image that shows outliers: PAAD boxplot for RNA seq data
So my question is this: do I have to remove these outlier samples from my study before I do a differential gene expression analysis? Thanks in advance.
Which outlier samples? The data distributions across this large number of samples look quite similar. A box-and-whisker plot is only part of the story, of course. You should additionally look at a PCA bi-plot of PC1 versus PC2.
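In case it helps, here is a minimal sketch of that PCA check in Python with NumPy. It is only an illustration under stated assumptions: the expression matrix is simulated (it is not the PAAD data), the values are assumed to already be on a log scale, and the function names `pca_scores` and `flag_outliers` and the 5-MAD cutoff are hypothetical choices of mine, not anything that TCGA assembler produces.

```python
import numpy as np


def pca_scores(expr, n_components=2):
    """PCA via SVD on a genes x samples matrix of log-scale expression.

    Returns (scores, explained): scores is samples x n_components, and
    explained holds the fraction of variance for each returned component.
    """
    x = expr.T                      # put samples in rows
    x = x - x.mean(axis=0)          # centre each gene across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]


def flag_outliers(scores, n_mads=5.0):
    """Flag samples far from the median on any of the given components.

    The cutoff (n_mads median absolute deviations) is an arbitrary
    assumption for illustration; inspect the plot before trusting it.
    """
    med = np.median(scores, axis=0)
    mad = np.median(np.abs(scores - med), axis=0)
    mad = np.where(mad == 0, np.finfo(float).eps, mad)  # avoid divide-by-zero
    return np.any(np.abs(scores - med) > n_mads * mad, axis=1)


# Simulated stand-in data: 500 genes x 40 samples (NOT real TCGA values).
rng = np.random.default_rng(0)
expr = rng.normal(loc=8.0, scale=1.0, size=(500, 40))
# Make sample index 7 deliberately aberrant so there is something to find.
expr[:, 7] += rng.normal(loc=3.0, scale=1.0, size=500)

scores, explained = pca_scores(expr)
flagged = flag_outliers(scores)

print("Variance explained by PC1, PC2:", explained)
print("Flagged sample indices:", np.flatnonzero(flagged))
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the PC1 versus PC2 view. A sample that sits far from the main cloud there is a better candidate for a closer look than one that merely has a few points beyond the whiskers in a box plot.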
Please see How to add images to a Biostars post to add your images properly. You need to use the add image button and the direct link to the image, not the hyperlink button and a link to the webpage that has the image embedded (which is what you have used here).
Alternatively, if you are using limma or edgeR, you can use their robust settings when fitting the linear models, so that outlier samples will have less influence on your results.