Hi there,
I am quite new to the bioinformatics world and would appreciate a bit of help, please. I would like an explanation of the count distribution in an RNA-seq analysis. From what I have studied and understood, the counts typically follow a negative binomial distribution, and for that reason the calculations of the DESeq function are appropriate for the analysis. My question is: if I do a "manual" filtering and remove the genes with a base mean under a threshold I choose (this filters out around 70% of the genes), and then plot the count distribution of the remaining genes, they no longer show the negative binomial distribution they had before. Instead, their distribution looks more like a normal distribution. So I would like to know: does this affect the analysis? Should I now consider other ways of analysing the data?
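To make the filtering step concrete, here is a minimal sketch of what I mean by "manual" filtering. It is written in plain Python for illustration only and is not my actual pipeline. The toy counts, the threshold of 10, and the function name `filter_by_base_mean` are all made up for this example. It also simplifies by averaging raw counts, whereas the base mean reported by DESeq2 is computed from normalized counts.

```python
from statistics import mean


def filter_by_base_mean(counts, threshold):
    """Keep only genes whose mean count across samples is at least `threshold`.

    `counts` maps a gene name to its list of per-sample counts.
    Hypothetical helper: this is not a DESeq2 function.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if mean(values) >= threshold
    }


# Toy count matrix (made-up numbers): gene -> counts in four samples.
toy_counts = {
    "geneA": [0, 1, 0, 2],
    "geneB": [15, 22, 18, 30],
    "geneC": [3, 0, 4, 1],
    "geneD": [250, 310, 190, 275],
}

# Assumed threshold, chosen only for this illustration.
kept = filter_by_base_mean(toy_counts, threshold=10)
print(sorted(kept))
```

In this toy case the two low-count genes are dropped, so a histogram over the remaining genes no longer has the large pile of values near zero, which is the kind of change in shape I am describing.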
Thank you in advance.