Hello friends, I want to know how to test HTSeq count data for normality in R. Does anybody have any idea? Thanks.
I fully agree with h.mon's comment. RNA-seq counts are not normally distributed. Plot a histogram of the counts and you will see it without any test. DESeq2 models counts with a negative binomial distribution. If you have standard RNA-seq data, there is no need to filter anything, check for any distribution, or apply any outlier (or other) correction. Please start with your raw HTSeq counts and follow the DESeq2 manual. If any such step were needed, it would be described in the DESeq2 manual.
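To illustrate the point about the histogram, here is a minimal sketch in plain Python (standard library only) that simulates overdispersed counts as a gamma-Poisson mixture, which is one way to draw from a negative binomial. The mean of 20, the dispersion of 0.5, the sample size, and all function names are made up for illustration; nothing here comes from real HTSeq output or from DESeq2 itself.

```python
import random
import statistics


def simulate_counts(n, mean, dispersion, seed=0):
    """Draw n negative binomial counts via a gamma-Poisson mixture.

    Under this parameterisation: variance = mean + dispersion * mean**2.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        # Gamma with shape 1/dispersion and scale mean*dispersion has mean `mean`.
        rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
        # Poisson(rate) draw: count unit-rate exponential arrivals before `rate`.
        k = 0
        t = rng.expovariate(1.0)
        while t < rate:
            k += 1
            t += rng.expovariate(1.0)
        counts.append(k)
    return counts


def skewness(values):
    """Sample skewness (third standardised moment); about 0 for normal data."""
    m = statistics.mean(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


def text_histogram(counts, bin_width=10, max_bar=50):
    """Return a crude text histogram, one line per occupied bin."""
    bins = {}
    for c in counts:
        start = (c // bin_width) * bin_width
        bins[start] = bins.get(start, 0) + 1
    tallest = max(bins.values())
    lines = []
    for start in sorted(bins):
        bar = "#" * max(1, round(bins[start] / tallest * max_bar))
        lines.append(f"{start:4d}-{start + bin_width - 1:<4d} {bar}")
    return lines


# Hypothetical parameters: one gene, mean 20, dispersion 0.5.
counts = simulate_counts(5000, mean=20.0, dispersion=0.5, seed=1)
print("mean:", round(statistics.mean(counts), 2))
print("variance:", round(statistics.variance(counts), 2))
print("skewness:", round(skewness(counts), 2))
print("\n".join(text_histogram(counts)))
```

The histogram should show a long right tail and a variance well above the mean, which is the shape a negative binomial model is meant to capture and a normal model is not.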
Please provide more details and put some more effort into your question: for example, give some background on why you want to perform these tests, what you have read so far, and so on.
It is widely known and accepted that RNA-seq count data do not follow a normal distribution; they are better modeled with a negative binomial distribution. See some old posts:
Hi h.mon, thanks for responding. I want to do differential expression analysis (with the DESeq2 package) and I was not sure whether my HTSeq data follow a normal distribution. I am also not sure whether non-normal data would cause a problem for differential expression analysis. So I wanted to get rid of outliers before doing the analysis. What do you think?
Please read the DESeq2 vignette; it addresses all of your questions so far.
I see. Thanks, ATpoint.
A simple question: when you mentioned a histogram of counts, what exactly did you mean? Counts of a specific gene across all samples and conditions, or counts of all genes in a sample?
Additionally, raw read counts are different from normalized gene-level values (like TPM/FPKM/RPKM). Do transformed counts still follow a negative binomial distribution (NBD)?
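As context for the raw-count versus TPM distinction, here is a minimal sketch of the usual TPM calculation in plain Python. The four counts and gene lengths are invented, and the `tpm` function is a hypothetical helper, not part of HTSeq or DESeq2. It only shows that TPM values are length-scaled proportions that always sum to one million, so they are no longer integer counts.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million: length-normalise, then scale to sum to 1e6."""
    # Reads per kilobase for each gene.
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Made-up example: four genes in one sample.
raw_counts = [100, 500, 0, 2500]          # integer read counts
gene_lengths_kb = [1.0, 2.5, 0.8, 5.0]    # assumed gene lengths in kilobases

tpm_values = tpm(raw_counts, gene_lengths_kb)
print(tpm_values)        # [125000.0, 250000.0, 0.0, 625000.0]
print(sum(tpm_values))   # 1000000.0
```

Because every sample is rescaled to the same total, the information about sequencing depth is gone, which is one reason the DESeq2 documentation asks for raw, un-normalized counts as input.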