RNA-Seq and proteomic datasets typically contain thousands of genes, proteins, or clinical variables, with only several samples per group. Should I test for normality separately for each gene within each group (plan A), or just once on all the data together, for example on a matrix containing all genes across all samples (plan B)? I am rather confused.
For plan A, is it valid to mix tests like this? Depending on the outcome of the normality test, some genes would get a t-test and others a non-parametric test. And how should the p-values then be adjusted for multiple testing? Should all the p-values from the different tests be pooled together before the correction?
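To make plan A concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the data are simulated, the function names `plan_a` and `bh_adjust` are made up, Shapiro-Wilk is used as the normality test, Welch's t-test and Mann-Whitney U as the two alternatives, and Benjamini-Hochberg as the correction applied to the pooled p-values.

```python
import numpy as np
from scipy import stats


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (illustrative helper)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def plan_a(group1, group2, alpha_normality=0.05):
    """Per-gene test selection; inputs have shape (genes, samples)."""
    pvals, tests = [], []
    for x, y in zip(group1, group2):
        # Assumption: a gene counts as "normal" only if both groups
        # pass Shapiro-Wilk at the chosen threshold.
        normal = (stats.shapiro(x).pvalue > alpha_normality
                  and stats.shapiro(y).pvalue > alpha_normality)
        if normal:
            p = stats.ttest_ind(x, y, equal_var=False).pvalue
            tests.append("welch_t")
        else:
            p = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue
            tests.append("mann_whitney")
        pvals.append(p)
    pvals = np.array(pvals)
    # Pool the p-values from both kinds of test and correct them together.
    return pvals, bh_adjust(pvals), tests


# Simulated example: 200 genes, 6 samples per group, 20 genes shifted.
rng = np.random.default_rng(0)
g1 = rng.normal(size=(200, 6))
g2 = rng.normal(size=(200, 6))
g2[:20] += 3
raw_p, adj_p, tests = plan_a(g1, g2)
```

This only shows the mechanics of pooling the p-values; whether choosing the test per gene in this way is statistically sound is exactly what I am unsure about.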
For plan B, the data will fail the normality test most of the time (even after log transformation), so I would almost always end up using a non-parametric test.
Thank you.
I know that some RNA-Seq methods, such as limma-voom, DESeq2 and edgeR, do not require normally distributed data.