Hello, I am analysing an RNA-seq dataset of 45 samples (15 conditions x 3 replicates per condition) using SeqMonk. I understand that if the cumulative distribution plot shows a big divergence between individual samples, I should consider normalising. Specifically: percentile normalisation quantitation for a "vertical" shift, i.e. parallel lines caused by contamination by a single gene or a few genes (commonly rRNA); or DNA contamination normalisation for a "horizontal" shift at the base of the plot, which indicates DNA contamination. With the cumulative distribution plot shown below, I can't tell whether the difference between the lines is dramatic enough to warrant normalisation of any sort (most likely percentile normalisation, as the lines are parallel). As you can see in the second image, earlier QC steps showed effectively no rRNA contamination in my samples. Any advice will be much appreciated, thanks!
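To make the "vertical shift" idea concrete, here is a minimal sketch of an additive percentile normalisation on log2 values. It is not SeqMonk's implementation: the function names, the choice of the 75th percentile and the toy data are all assumptions made for illustration. It only shows why parallel cumulative distribution lines collapse onto each other once a chosen percentile is aligned.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_normalise(samples, pct=75.0):
    """Shift each sample so its pct-th percentile equals the mean percentile.

    `samples` maps a sample name to a list of log2 expression values.
    The 75th percentile default is an assumption, not a SeqMonk setting.
    """
    per_sample = {name: percentile(vals, pct) for name, vals in samples.items()}
    target = sum(per_sample.values()) / len(per_sample)
    return {
        name: [v + (target - per_sample[name]) for v in vals]
        for name, vals in samples.items()
    }


# Toy data (hypothetical): sample_B is sample_A shifted down by 1 log2 unit,
# i.e. two parallel lines on a cumulative distribution plot.
samples = {
    "sample_A": [0.0, 2.0, 4.0, 6.0, 8.0],
    "sample_B": [-1.0, 1.0, 3.0, 5.0, 7.0],
}
normalised = percentile_normalise(samples)
```

In this toy case the two samples become identical after the shift. With real data the lines would only be brought closer together, and a small residual offset may not be worth correcting.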
You can extract read counts using featureCounts and run the DESeq2 pipeline, which is well documented. The first things to look at are PCA and clustering; those would give you an idea of how the samples compare. I have used SeqMonk quite a few times, but I would recommend exploring the data with DESeq2.
Hi, many thanks for your reply! Just to make sure I've understood you, what do you mean when you say "take out read count"? Do you mean exclude it in any way? Like, normalise and not deal with raw reads?
No, I mean use the BAM files to get a count table for all the samples, with genes in rows and samples in columns.
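As a rough illustration of that layout, the sketch below reads featureCounts-style output into a genes-by-samples table. It assumes the default output format (a leading `#` comment line, then the six annotation columns Geneid, Chr, Start, End, Strand, Length, then one count column per BAM file) and integer counts. The function name, gene IDs and file names are hypothetical.

```python
import csv
import io

ANNOTATION_COLUMNS = 6  # Geneid, Chr, Start, End, Strand, Length


def read_count_table(handle):
    """Return (sample_names, {gene_id: [raw count per sample]})."""
    # Skip the leading comment line(s) that start with '#'.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    sample_names = header[ANNOTATION_COLUMNS:]
    counts = {}
    for row in reader:
        if not row:
            continue
        # Assumes integer counts; fractional counting modes would need float().
        counts[row[0]] = [int(x) for x in row[ANNOTATION_COLUMNS:]]
    return sample_names, counts


# Hypothetical two-sample example in place of a real output file.
example = (
    "# Program:featureCounts (hypothetical command line)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tcondA_rep1.bam\tcondA_rep2.bam\n"
    "geneX\tchr1\t100\t900\t+\t801\t25\t31\n"
    "geneY\tchr2\t500\t1500\t-\t1001\t0\t4\n"
)
sample_names, counts = read_count_table(io.StringIO(example))
```

Here `counts["geneX"]` holds one raw count per sample, in the order given by `sample_names`. DESeq2 expects this kind of un-normalised integer count matrix as input and applies its own normalisation.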
Yes to the second part: "Like, normalise and not deal with raw reads".