We want to analyze expression data for protein-coding genes only. Right now we pre-process by first filtering out non-expressed genes. We want to know which of two orders is more correct: (a) run TMM normalization on the whole dataset with all genes included, then filter for protein-coding genes, then do differential gene expression analysis; or (b) filter for protein-coding genes first, then run TMM normalization, then do differential gene expression analysis. Which approach is correct: normalization first or filtering first? A literature search turns up pros and cons for both approaches. Are both equally good or bad?
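The two orderings can be compared on toy data. The following is a minimal sketch in Python with NumPy, not the actual pipeline: `tmm_factor` is a hypothetical, simplified and unweighted version of the trimmed-mean-of-M-values idea (the published method also uses precision weights and rescales the factors across samples), and the simulated counts, the `is_coding` mask, and the 4x shift in non-coding genes are all made-up assumptions for illustration.

```python
import numpy as np

def tmm_factor(obs, ref, logratio_trim=0.3, sum_trim=0.05):
    """Simplified, unweighted TMM-style scaling factor of `obs` relative to `ref`."""
    obs = np.asarray(obs, dtype=float)
    ref = np.asarray(ref, dtype=float)
    keep = (obs > 0) & (ref > 0)              # log ratios need non-zero counts
    p_obs = obs[keep] / obs.sum()             # proportions of each library
    p_ref = ref[keep] / ref.sum()
    m = np.log2(p_obs / p_ref)                # M: log fold change per gene
    a = 0.5 * np.log2(p_obs * p_ref)          # A: average log abundance per gene
    m_lo, m_hi = np.quantile(m, [logratio_trim, 1 - logratio_trim])
    a_lo, a_hi = np.quantile(a, [sum_trim, 1 - sum_trim])
    core = (m >= m_lo) & (m <= m_hi) & (a >= a_lo) & (a <= a_hi)
    return float(2 ** m[core].mean())         # plain mean of the trimmed M values

# Hypothetical toy data: 2000 genes, the first 1500 flagged as protein-coding.
rng = np.random.default_rng(0)
n_genes = 2000
is_coding = np.arange(n_genes) < 1500
base = rng.lognormal(mean=4.0, sigma=1.5, size=n_genes)
mean_b = base.copy()
mean_b[~is_coding] *= 4                       # assumed shift in non-coding genes
sample_a = rng.poisson(base)
sample_b = rng.poisson(mean_b)

# (a) Normalize on all genes, then keep protein-coding genes.
factor_all = tmm_factor(sample_b, sample_a)
ratio_norm_first = np.log2(sample_b.sum() * factor_all / sample_a.sum())

# (b) Keep protein-coding genes, then normalize on that subset.
a_pc, b_pc = sample_a[is_coding], sample_b[is_coding]
factor_coding = tmm_factor(b_pc, a_pc)
ratio_filter_first = np.log2(b_pc.sum() * factor_coding / a_pc.sum())

print("factor, all genes:        ", factor_all)
print("factor, coding genes only:", factor_coding)
print("log2 effective library-size ratio, normalize first:", ratio_norm_first)
print("log2 effective library-size ratio, filter first:   ", ratio_filter_first)
```

The quantity to compare is the log2 ratio of effective library sizes (library size times scaling factor) between the two samples, since that is what shifts the log fold changes of the protein-coding genes. Running the same comparison on real counts would show how much the ordering matters for a given dataset.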
You are doing DE analysis with TMM values?
We do TMM normalization followed by a log2 transformation, so yes, we run the DE analysis on these values.