Is it possible to identify genes that are not expressed from RNA-seq data? One obvious way is to look at genes with TPM or FPKM values of 0. For the sake of simplicity, ignore false negatives in this analysis.
However, some genes will be false positives in RNA-seq analysis and will report a low TPM even though they are not expressed. Is there a crude way to define a TPM/FPKM cutoff below which I can assume a gene is not expressed? I found one reference (Mortazavi A et al. 2008), where they suggest:
"Transcript detection was robust at 1.0 RPKM and above for a typical 2-kilobase (kb) mRNA"
However, they spiked their samples with standard foreign mRNA so that they could estimate expression levels afterwards. How can I identify non-expressed genes from ordinary RNA-seq data? Any crude method is fine as long as it is logical. One method I thought of was to plot the histogram of TPM and use the 0.1 quantile as the cutoff.
Note: I am fine if low-expression transcripts are not included.
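To make the quantile idea concrete, here is a minimal sketch of what I have in mind. The function names (`quantile_cutoff`, `call_not_expressed`) and the synthetic TPM values are made up for illustration. Computing the quantile over nonzero TPM values only is my own assumption: otherwise, if more than 10% of genes have TPM of exactly 0, the 0.1 quantile would itself be 0.

```python
import numpy as np


def quantile_cutoff(tpm, q=0.1, exclude_zeros=True):
    """Return the q-th quantile of TPM values as a crude expression cutoff.

    exclude_zeros=True is an assumption of this sketch: genes with TPM == 0
    are left out so that they do not drag the quantile down to 0.
    """
    tpm = np.asarray(tpm, dtype=float)
    values = tpm[tpm > 0] if exclude_zeros else tpm
    if values.size == 0:
        raise ValueError("no TPM values left to compute a quantile from")
    return float(np.quantile(values, q))


def call_not_expressed(tpm, cutoff):
    """Boolean mask: True where a gene falls strictly below the cutoff."""
    return np.asarray(tpm, dtype=float) < cutoff


# Synthetic example (not real data): log-normal TPMs with some exact zeros.
rng = np.random.default_rng(0)
example_tpm = np.concatenate([
    np.zeros(200),
    rng.lognormal(mean=1.0, sigma=2.0, size=1800),
])
example_cutoff = quantile_cutoff(example_tpm, q=0.1)
example_mask = call_not_expressed(example_tpm, example_cutoff)
print(f"cutoff = {example_cutoff:.3f} TPM, "
      f"{example_mask.sum()} of {example_mask.size} genes below it")
```

The choice of `q` is still arbitrary here, which is exactly the part I would like to justify.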
Hi. I have read the Hart T et al. paper. In it, the authors used -15 (in log2(FPKM)) as the cutoff for non-expressed genes. However, it was not mentioned how that cutoff was selected, and I am looking for a logical reason to select a cutoff, or at least a statistical method that keeps the choice unbiased. I tried fitting a Gaussian to my histogram and calculating zFPKM as described in the paper, but that raises the same question of how to select the cutoff.
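For reference, this is roughly what I tried, written as a simplified sketch rather than the paper's exact procedure. It assumes the main (expressed) peak of log2(FPKM) is approximately Gaussian, takes the mode of a histogram as its center, and estimates the standard deviation from the values to the right of the mode using the half-normal relation (mean distance from the center = sigma * sqrt(2/pi)). The function name `zfpkm`, the histogram-based mode estimate, and the synthetic data are my own choices for illustration.

```python
import numpy as np


def zfpkm(fpkm, bins=100):
    """Standardize log2(FPKM) against a Gaussian fitted to the main peak.

    Returns (z, mu, sigma). Genes with FPKM == 0 get z = -inf because
    log2(0) is undefined; that convention is an assumption of this sketch.
    """
    fpkm = np.asarray(fpkm, dtype=float)
    positive = fpkm > 0
    log2_fpkm = np.log2(fpkm[positive])

    # Center of the Gaussian: midpoint of the tallest histogram bin.
    counts, edges = np.histogram(log2_fpkm, bins=bins)
    peak = int(np.argmax(counts))
    mu = 0.5 * (edges[peak] + edges[peak + 1])

    # Only the right half is used, so low-level noise on the left side
    # does not inflate the spread. For a half-normal distribution,
    # E[x - mu | x > mu] = sigma * sqrt(2 / pi).
    right = log2_fpkm[log2_fpkm > mu]
    sigma = (right.mean() - mu) * np.sqrt(np.pi / 2)

    z = np.full(fpkm.shape, -np.inf)
    z[positive] = (log2_fpkm - mu) / sigma
    return z, float(mu), float(sigma)


# Synthetic example (not real data): an expressed peak plus a low-level
# shoulder, with a few exact zeros.
rng = np.random.default_rng(0)
example_log2 = np.concatenate([
    rng.normal(3.0, 2.0, size=20000),   # "expressed" genes
    rng.normal(-4.0, 1.5, size=3000),   # low-level background
])
example_fpkm = np.concatenate([2.0 ** example_log2, np.zeros(50)])
example_z, example_mu, example_sigma = zfpkm(example_fpkm)
print(f"mu = {example_mu:.2f}, sigma = {example_sigma:.2f}")
```

This gives every gene a z-score relative to the expressed peak, but the threshold applied to `z` still has to be chosen somehow, which is the question I am stuck on.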