I'm interested in comparing the genes expressed in several cell types in order to infer the functions of these cell types.
I use DESeq to find genes that are differentially expressed (DE). However, a gene doesn't have to be DE to be involved in a cell's function, so ideally I'd like a reasonable method for finding genes that are simply "expressed".
Some previous members of my lab came up with the Quartile Expression method: a gene is considered expressed if its transcript count (normalised but not transformed) is in the upper quartile of all genes across all samples. A major issue: some genes simply have far higher transcript counts, so they are always called expressed (across all samples), and this skews the quartile (Q) values.
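To make the rule concrete, here is a minimal sketch of how I understand the Quartile Expression method. It assumes the threshold is the 75th percentile of all normalised counts pooled over every gene and every sample, and that "in the upper quartile" means strictly above that threshold. The function name `quartile_expressed`, the gene names and the counts are all made up for illustration; this is not our lab's actual code.

```python
from statistics import quantiles


def quartile_expressed(counts):
    """Call genes 'expressed' per sample with a pooled upper-quartile cutoff.

    counts: dict mapping gene name -> list of normalised (untransformed)
            counts, one value per sample.
    Returns (calls, threshold), where calls maps each gene to a list of
    booleans, one per sample.
    """
    # Pool every count from every gene and sample (assumed reading of
    # "upper quartile of all genes across all samples").
    pooled = [value for row in counts.values() for value in row]
    # Third quartile (75th percentile) with linear interpolation.
    threshold = quantiles(pooled, n=4, method="inclusive")[2]
    # A gene is called expressed in a sample if its count exceeds the cutoff.
    calls = {
        gene: [value > threshold for value in row]
        for gene, row in counts.items()
    }
    return calls, threshold


# Hypothetical toy data: four genes, three samples.
toy_counts = {
    "geneA": [1000.0, 1200.0, 900.0],  # very high counts in every sample
    "geneB": [0.0, 1.0, 0.0],
    "geneC": [5.0, 50.0, 4.0],
    "geneD": [20.0, 30.0, 25.0],
}

calls, threshold = quartile_expressed(toy_counts)
for gene, row in calls.items():
    print(gene, row)
print("threshold:", threshold)
```

In this toy example geneA is called expressed in every sample, and its large counts pull the pooled cutoff up to 262.5, so geneC is not called expressed even in the sample where it rises to 50.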
Can anyone point me to a reference where other methods have been attempted? Any thoughts?
Is this a question about actual bioinformatics methods, or is this a philosophical question about what it means to “express” a gene?
Probably stats/bioinformatics, especially the part about dealing with genes that have exceptionally large transcript counts across all samples.