Hi,
Why do people carry out FPKM filtering? This may be an obvious question, but what does it ultimately achieve?
- Is it to remove lowly expressed, possibly contaminating reads from other organisms that may live in the same environment?
- Is it to remove 'background noise'?
If someone were to carry out FPKM filtering, how would they decide on a threshold? Should it be based on a density plot of the FPKM values of each sample used in the assembly?
Lastly, I have seen values of 0, 0.3, 1 and 1.5 FPKM being used as thresholds. Is this arbitrary, or do people select a value based on a certain parameter or feature of the data?
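To make the threshold question concrete, here is a minimal Python sketch that counts how many transcripts would survive each of the cutoffs mentioned above. The transcript IDs and FPKM values are made up for illustration, and treating the cutoff as a strict "greater than" is an assumption (so that a threshold of 0 drops unexpressed transcripts); some pipelines may use "greater than or equal to" instead.

```python
# Hypothetical FPKM values for one sample; replace with real per-transcript values.
fpkm = {"t1": 0.0, "t2": 0.1, "t3": 0.5, "t4": 1.2, "t5": 3.0, "t6": 250.0}


def count_passing(values, threshold):
    """Count transcripts whose FPKM is strictly above the threshold (assumed convention)."""
    return sum(1 for v in values if v > threshold)


# Thresholds mentioned in the question.
thresholds = [0, 0.3, 1, 1.5]
survivors = {t: count_passing(fpkm.values(), t) for t in thresholds}

for t, n in survivors.items():
    print(f"FPKM > {t}: {n} of {len(fpkm)} transcripts kept")
```

Running the same count on each sample shows how sensitive the retained set is to the cutoff, which is one way to judge whether a given value is reasonable for the data rather than arbitrary.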
Would be keen to see what people think, and also the information people can provide on the matter.
Thanks.
Hi Ryan, thanks for the suggestion. That was interesting, as I have the same question. Good to know about the zFPKM method. I used the script available online at "https://github.com/severinEvo/gene_expression/blob/master/zFPKM.R". After computing zFPKM for every transcript, it gave me output with values ranging from -3 to 8. How do I find the threshold from these zFPKM values? Please correct me if I have misunderstood anything. Thanks in advance.
The zFPKM paper recommended a threshold of -3 (see Table 1 in the paper). Perhaps the script does the filtering for you; I've never used it.
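If the script only reports zFPKM values and does not filter, applying the cutoff yourself is straightforward. Below is a minimal Python sketch, not the script's own method: the transcript IDs and scores are invented, `filter_by_zfpkm` is a hypothetical helper, and keeping values at or above -3 (rather than strictly above) is an assumption you should check against the paper.

```python
# Cutoff taken from the recommendation above; inclusive comparison is an assumption.
ZFPKM_CUTOFF = -3.0

# Hypothetical zFPKM scores, e.g. parsed from the script's output table.
zfpkm = {"t1": -3.4, "t2": -2.9, "t3": 0.0, "t4": 7.8, "t5": -3.0}


def filter_by_zfpkm(scores, cutoff=ZFPKM_CUTOFF):
    """Return only the transcripts whose zFPKM is at or above the cutoff."""
    return {tid: z for tid, z in scores.items() if z >= cutoff}


kept = filter_by_zfpkm(zfpkm)
print(f"Kept {len(kept)} of {len(zfpkm)} transcripts: {sorted(kept)}")
```

Note that if your values already range from about -3 upward, as you describe, this cutoff would remove very little, so it is worth checking whether the script has already dropped the transcripts below it.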