Question

Should I remove samples after normalization of miRNA-seq read counts?

0


7.5 years ago

Björn ▴ 110

I followed the RNAseq123 workflow (https://www.bioconductor.org/help/workflows/RNAseq123/) for my RNA-seq analysis. As my read counts were only around 2.5 million, I had to use a higher CPM cutoff. I hope this is fine for downstream analysis.
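For context on the cutoff: CPM scales with library size, so a fixed minimum read count maps to a higher CPM in a smaller library. A minimal sketch of that arithmetic, assuming the 2.5 million figure is the per-sample library size and using a hypothetical minimum of 10 reads (neither that threshold nor the helper name comes from the workflow):

```r
# Convert a minimum raw read count into the equivalent CPM for a library size.
# cpm_for_count is a hypothetical helper; min_count = 10 is an illustrative choice.
cpm_for_count <- function(min_count, lib_size) {
  min_count / lib_size * 1e6
}

# Assumed library sizes: one small (as in this question) and one larger for contrast.
lib_sizes <- c(small = 2.5e6, large = 20e6)
cutoffs <- cpm_for_count(10, lib_sizes)
print(cutoffs)
```

Under these assumptions, 10 reads correspond to a CPM of 4 in a 2.5 million read library and 0.5 in a 20 million read library.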

After running the following commands:

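library(edgeR)  # provides cpm() and calcNormFactors()

par(mfrow=c(1,2))
# Panel A: log-CPM before normalisation
lcpm <- cpm(y2, log=TRUE)
boxplot(lcpm, las=2, col=group$Sample, main="")
title(main="A. Example: Unnormalised data at CPM-2", ylab="Log-cpm")
# Panel B: log-CPM after TMM normalisation (the calcNormFactors default)
y2 <- calcNormFactors(y2)
y2$samples$norm.factors
lcpm <- cpm(y2, log=TRUE)  # use y2 (the normalised object), not y
boxplot(lcpm, las=2, col=group$Sample, main="")
title(main="B. Example: Normalised data at CPM-2", ylab="Log-cpm")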

I got the following boxplot:

![enter image description here][1]

[1]: https://ibb.co/gWW2r6

Based on the normalized data, which samples should I remove from the analysis?

rna-seq RNA-Seq edgeR miRNAs • 1.6k views

updated 6.7 years ago by Biostar 20 • written 7.5 years ago by Björn ▴ 110

score 0 · Answer 1 · 2018-01-31

0


7.4 years ago

Kevin Blighe 89k

Some of the samples look different from the others in terms of their data distribution in the box-and-whisker plot. However, I would reserve judgement on outliers until I had also seen a PCA bi-plot and a violin plot.
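As a rough sketch of those checks, the following base-R example runs a PCA with prcomp and overlays per-sample density curves (used here as a dependency-free stand-in for violin plots, since they show the same distributional information). The matrix lcpm_sim is simulated and sample S8 is shifted on purpose, so everything in it is illustrative; with real data you would use the log-CPM matrix from cpm(y2, log=TRUE) instead.

```r
# Simulated stand-in for a log-CPM matrix (rows = miRNAs, columns = samples).
# With real data this would be the output of cpm(y2, log=TRUE).
set.seed(1)
n_genes <- 200
n_samples <- 8
lcpm_sim <- matrix(rnorm(n_genes * n_samples, mean = 5, sd = 2),
                   nrow = n_genes,
                   dimnames = list(paste0("mir", seq_len(n_genes)),
                                   paste0("S", seq_len(n_samples))))
lcpm_sim[, "S8"] <- lcpm_sim[, "S8"] - 3   # deliberately shift one sample

# PCA on samples: prcomp expects observations in rows, so transpose.
pca <- prcomp(t(lcpm_sim), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

par(mfrow = c(1, 2))

# Panel 1: samples on the first two principal components.
plot(pca$x[, 1], pca$x[, 2], type = "n",
     xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]),
     main = "PCA of samples")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x))

# Panel 2: per-sample density curves (base-R stand-in for a violin plot).
dens <- lapply(seq_len(n_samples), function(j) density(lcpm_sim[, j]))
y_max <- max(sapply(dens, function(d) max(d$y)))
plot(NULL, xlim = range(lcpm_sim), ylim = c(0, y_max),
     xlab = "Log-cpm", ylab = "Density", main = "Per-sample densities")
for (j in seq_len(n_samples)) lines(dens[[j]], col = j)
legend("topright", legend = colnames(lcpm_sim),
       col = seq_len(n_samples), lty = 1, cex = 0.7)
```

A sample that sits apart from the rest on the PCA plot and also has a visibly different density curve is a stronger outlier candidate than one that only looks unusual in the boxplot.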

Kevin

6.7 years ago by Kevin Blighe 89k