Identify highly and lowly expressed genes from Cufflinks output
10.2 years ago
hana ▴ 190

Hi

I want to find highly and lowly expressed genes from Cufflinks output (the genes.fpkm_tracking file) for a single sample. How can I choose an FPKM cutoff to decide whether a gene is highly expressed or not?

Some papers use FPKM > 1. With this threshold I found ~11,000 genes, but I would like to find the most highly and the most lowly expressed genes.

I would be grateful for any suggestions.

Thank you

RNA-Seq • 3.0k views
10.2 years ago

Make a histogram of the FPKMs and see if there are multiple humps. If so, the valley between them is your threshold. If not, you get to arbitrarily pick a value that looks nice (I would personally just take the top x% as highly expressed). The FPKM cutoff of 1 that you see in some papers is just a nice round number with no actual legitimacy (regardless of what anyone might say to the contrary).
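A minimal sketch of both ideas (histogram first, then a top/bottom x% split), using only the Python standard library. It assumes the usual tab-delimited genes.fpkm_tracking layout with `tracking_id`, `FPKM` and `FPKM_status` columns; the function names, the 0.01 pseudocount, the 0.5 bin width and the 10% fractions are illustrative choices, not recommended values.

```python
import csv
import math


def read_fpkm(handle):
    """Return {tracking_id: FPKM} from an open genes.fpkm_tracking handle."""
    fpkm = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # Skip genes that Cufflinks did not flag as successfully quantified.
        if row.get("FPKM_status", "OK") != "OK":
            continue
        fpkm[row["tracking_id"]] = float(row["FPKM"])
    return fpkm


def log_histogram(values, bin_width=0.5, pseudocount=0.01):
    """Count genes per log10(FPKM + pseudocount) bin: {bin_start: count}.

    The pseudocount (an arbitrary choice) keeps FPKM = 0 genes on the plot.
    """
    counts = {}
    for value in values:
        x = math.log10(value + pseudocount)
        start = math.floor(x / bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def split_by_rank(fpkm, bottom_fraction=0.10, top_fraction=0.10):
    """Return (low, high) gene lists taken from the ranked FPKM > 0 genes."""
    expressed = sorted((g for g, v in fpkm.items() if v > 0), key=fpkm.get)
    n_low = int(len(expressed) * bottom_fraction)
    n_high = int(len(expressed) * top_fraction)
    low = expressed[:n_low]
    high = expressed[len(expressed) - n_high:] if n_high else []
    return low, high


if __name__ == "__main__":
    with open("genes.fpkm_tracking") as handle:
        fpkm = read_fpkm(handle)
    # Crude text histogram: look for more than one hump.
    for start, count in log_histogram(fpkm.values()).items():
        print(f"{start:6.1f}  {'#' * (count // 50)}  {count}")
    low, high = split_by_rank(fpkm)
    print(len(low), "lowest-ranked genes;", len(high), "highest-ranked genes")
```

Plotting on a log scale matters here because FPKM values span several orders of magnitude, so humps are hard to see on a linear axis. Genes with FPKM = 0 are left out of the ranking so that the "low" set contains detected genes rather than undetected ones; whether that is appropriate depends on your question.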


Thank you so much.

10.2 years ago

There are some statistics for a few cutoffs in this paper.

That said, the goal is a little different: I think 0.1 RPKM is generally a good threshold for excluding low-coverage genes, which tend to have high fold-change values, but I wouldn't necessarily describe genes with > 0.1 RPKM as having "high" expression (I would probably just say they are "expressed").


Thank you for your reply
