When performing Fisher's exact test, should I consider all annotated or all detected transcripts?
7.0 years ago
shawn.w.foley ★ 1.3k

I've performed RNA-seq on 30 cell lines and am trying to determine whether oncogenes are enriched among genes that are highly expressed (>50 RPKM) in more than 15 cell lines. Of the ~20,000 annotated mRNAs, only ~10,000 are expressed in at least one cell line (RPKM > 1). When I perform my Fisher's exact test, I will generate a 2x2 matrix that cross-classifies genes by whether they are highly expressed and whether they are oncogenes, counted against all detectable genes.

My question is: should I consider only detectable genes (and detectable oncogenes) when I perform my Fisher's test, or should I consider all annotated genes?

I think I should consider only genes that are detectable in one or more cell lines, and subset the list of oncogenes accordingly. It seems unfair to look for enrichment among the 20,000 annotated genes when only half of them are actually expressed. Or am I overthinking this problem?
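To make the two options concrete, here is a minimal sketch of how the 2x2 matrix changes with the choice of background. The helper `contingency_table` and all gene names are hypothetical toy data, not my real results.

```python
def contingency_table(universe, highly_expressed, oncogenes):
    """Return [[a, b], [c, d]], counting only genes inside `universe`.

    a: highly expressed oncogenes
    b: highly expressed non-oncogenes
    c: oncogenes that are not highly expressed
    d: genes that are neither
    """
    universe = set(universe)
    # Genes outside the chosen background are dropped from both lists.
    high = set(highly_expressed) & universe
    onco = set(oncogenes) & universe
    a = len(high & onco)
    b = len(high - onco)
    c = len(onco - high)
    d = len(universe - high - onco)
    return [[a, b], [c, d]]


# Toy data (made up): 20 annotated genes, 10 of them detected.
annotated = {f"g{i}" for i in range(1, 21)}
detected = {f"g{i}" for i in range(1, 11)}
high = {"g1", "g2", "g3", "g4"}
oncogenes = {"g1", "g2", "g9", "g15", "g16"}

table_detected = contingency_table(detected, high, oncogenes)
table_annotated = contingency_table(annotated, high, oncogenes)
print(table_detected)   # [[2, 2], [1, 5]]
print(table_annotated)  # [[2, 2], [3, 13]]
```

Only the bottom row changes: undetected oncogenes and undetected non-oncogenes both land in the "not highly expressed" row when the background is all annotated genes.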

Thank you!

RNA-Seq fisher's test expression statistics • 1.5k views
7.0 years ago
Hussain Ather ▴ 990

You'd want to use the detectable genes. You can read more about Fisher's exact test in RNA-Seq in this paper. I think Table 1 might help.
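As a rough illustration of why the background matters, the sketch below computes a one-sided Fisher's exact p-value (the upper hypergeometric tail) for the same overlap under both backgrounds. The function `fisher_enrichment_p` and all counts are made up for illustration, only loosely scaled to the numbers in the question; in practice `scipy.stats.fisher_exact(table, alternative="greater")` should give the same one-sided result.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for enrichment.

    Table layout: [[a, b], [c, d]] with
    a = highly expressed oncogenes, b = highly expressed non-oncogenes,
    c = other oncogenes,            d = other non-oncogenes.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    n = a + b + c + d
    n_high = a + b   # row margin: highly expressed genes
    n_onco = a + c   # column margin: oncogenes
    max_overlap = min(n_high, n_onco)
    total = comb(n, n_high)
    tail = sum(
        comb(n_onco, k) * comb(n - n_onco, n_high - k)
        for k in range(a, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 1,000 highly expressed genes, 50 of them oncogenes.
# Detected background: 10,000 genes, 300 detected oncogenes.
p_detected = fisher_enrichment_p(50, 950, 250, 8750)
# Annotated background: 20,000 genes, 400 annotated oncogenes.
p_annotated = fisher_enrichment_p(50, 950, 350, 18650)

print(f"detected background:  p = {p_detected:.3g}")
print(f"annotated background: p = {p_annotated:.3g}")
```

With these toy counts the all-annotated background gives the smaller p-value, because the undetected genes inflate the "not highly expressed" row even though they could never have been called highly expressed.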


This looks perfect, thank you!

