Can somebody suggest how to choose a minimum FPKM value as a threshold for filtering out low-quality transcripts?
You can try to use a zFPKM transform to define expressed genes: https://bioconductor.org/packages/devel/bioc/vignettes/zFPKM/inst/doc/zFPKM.html
we provide a novel normalization metric, zFPKM, that identifies the threshold between active and background gene expression; and we show that this threshold is robust to experimental and analytical variations
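To make the idea concrete, here is a rough sketch of a zFPKM-style transform in plain Python. It is not the Bioconductor package's code, just my reading of the approach: take log2(FPKM), find the peak of the density, estimate the spread from the right half of the distribution only, and convert to z-scores. The function names `fit_zfpkm` and `zfpkm`, the grid size, and the bandwidth rule are my own assumptions, and it also assumes the tallest peak is the "active" one.

```python
import math
import statistics


def fit_zfpkm(fpkm_values, grid_points=512):
    """Estimate (mu, sd) of the main expression peak on the log2(FPKM) scale."""
    # Zero FPKM has no finite log2, so those values are dropped (assumption).
    logs = [math.log2(v) for v in fpkm_values if v > 0]

    # Rule-of-thumb bandwidth for a Gaussian kernel density estimate (assumption).
    bandwidth = 1.06 * statistics.stdev(logs) * len(logs) ** -0.2

    def density(x):
        # Unnormalised KDE; the constant factor does not move the peak.
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in logs)

    # Evaluate the density on an evenly spaced grid and take the highest point.
    lo, hi = min(logs), max(logs)
    grid = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]
    mu = max(grid, key=density)

    # Use only the values right of the peak: for a half-normal distribution,
    # mean - mu = sd * sqrt(2 / pi), so sd = (mean - mu) * sqrt(pi / 2).
    upper = [v for v in logs if v > mu]
    sd = (statistics.mean(upper) - mu) * math.sqrt(math.pi / 2)
    return mu, sd


def zfpkm(fpkm, mu, sd):
    """z-score of a single positive FPKM value relative to the fitted peak."""
    return (math.log2(fpkm) - mu) / sd
```

You would call `fit_zfpkm` once per sample and then keep genes whose `zfpkm` value is above some cutoff. As far as I remember the zFPKM authors suggest something around -3, but check the vignette linked above before relying on that number.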
Hi folks, I am having the same issue with choosing an FPKM cutoff and haven't found any strong basis for one. :( Because I worked with pooled samples, I am not sure about the p-values either, so I am having a lot of trouble sorting the data. Please help me out.
You may choose 1 FPKM as the cutoff for a transcript/gene to be considered expressed. Anything below that can be filtered out.
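A minimal sketch of that cutoff in Python, in case it helps. The input layout (a dict mapping transcript ID to a list of per-sample FPKM values) and the function name `filter_expressed` are made up for illustration, and "reaches the cutoff in at least one sample" is my assumption about how to handle several samples.

```python
def filter_expressed(fpkm_table, min_fpkm=1.0):
    """Keep transcripts whose FPKM reaches min_fpkm in at least one sample."""
    return {
        transcript: values
        for transcript, values in fpkm_table.items()
        if max(values) >= min_fpkm
    }


# Hypothetical example data: transcript ID -> FPKM in each sample.
table = {
    "tx1": [0.2, 0.5],   # below 1 FPKM everywhere, dropped
    "tx2": [0.1, 3.0],   # expressed in the second sample, kept
    "tx3": [12.0, 9.5],  # expressed in both samples, kept
}
kept = filter_expressed(table)
```

If you want to be stricter, replace `max(values)` with `min(values)` so a transcript has to pass the cutoff in every sample.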
Filter out all transcripts with FPKM variance less than one across all samples/conditions
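The variance filter could be sketched like this, again in plain Python. The dict layout and the name `filter_by_variance` are hypothetical, and I am assuming the sample variance (n - 1 denominator) on raw FPKM values, which needs at least two samples per transcript.

```python
import statistics


def filter_by_variance(fpkm_table, min_variance=1.0):
    """Keep transcripts whose FPKM variance across samples is at least min_variance."""
    return {
        transcript: values
        for transcript, values in fpkm_table.items()
        if statistics.variance(values) >= min_variance
    }


# Hypothetical example data: transcript ID -> FPKM in each sample/condition.
table = {
    "txA": [5.0, 5.2, 4.9],  # expressed but nearly constant, dropped
    "txB": [1.0, 4.0, 9.0],  # changes a lot between conditions, kept
    "txC": [0.0, 0.1, 0.0],  # essentially silent, dropped
}
variable = filter_by_variance(table)
```

Note that this also throws away transcripts that are steadily expressed (like `txA` here), so it only makes sense if you care about differences between conditions.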
That highly depends on what you want to do with the data afterwards.
My real aim is to find novel genes, and many genes show very low FPKM values.
I obviously don't know how well your organism of interest has been characterized, but depending on what you are studying (organism, tissue), these 'novel genes' will probably have low expression. So throwing away lowly expressed genes is not what you want to do. Another criterion (if it's eukaryotic) would be to check whether it's a spliced transcript, or whether it corresponds to a conserved sequence in the genome.
You'll probably need a lot of sequencing data, and you'll need to perform validation afterwards to confirm you are not looking at an artefact.