Fisher Exact Test For TFs family
9.2 years ago
Kurban ▴ 230

Hello everyone,

I have finished the differential expression analysis of my RNA-seq data and retrieved the transcription factor (TF) transcripts from both the differentially expressed (DE) and the non-differentially expressed unigenes. I then assigned those TFs to TF families and noticed that some families contain many more DE TF sequences than others. Part of my data:

TF_families   non_DE_TFs_family_No.   DE_TFs_family_No.
AP-2          2                       0
ARID          5                       2
bHLH          67                      8
CG-1          1                       1
COE           2                       0
CP2           4                       0
CSD           4                       0
CSL           1                       0
CUT           6                       1
DDT           1                       0
DM            6                       0
E2F           3                       0
ETS           10                      2
Fork_head     22                      2
..
..

I want to find the TF families that are over-represented in my DE TF set under the treated condition. I have seen several papers analyse this kind of data with Fisher's exact test. I looked up some material on the test, but the examples all apply it to a 2x2 contingency table, and I still have no idea how to apply it to my data. Could anyone explain how to do this?

9.2 years ago
Asaf 10k

The two cells you are missing come from the overall totals: the number of DE TFs that are not in the family and the number of non-DE TFs that are not in the family. With those, you can put the data for each family into a contingency table like:

              DE  not DE
in group      A   B
not in group  C   D

and then compute a p-value for each of these tables with Fisher's exact test.
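As a rough sketch of the per-family tables described above, the snippet below builds A, B, C and D from the two count columns and computes a one-sided enrichment p-value from the hypergeometric distribution, using only the Python standard library. The family counts are the visible rows from the question. The totals (92 DE and 809 non-DE TFs) are an assumption: they are taken from the RHD table posted later in this thread and should be replaced by the sums over the full table. The names `family_table` and `enrichment_p` are made up for illustration.

```python
from math import comb

# A few visible rows from the question: family -> (non-DE count, DE count).
counts = {
    "ARID": (5, 2),
    "bHLH": (67, 8),
    "ETS": (10, 2),
    "Fork_head": (22, 2),
}

# ASSUMPTION: overall totals inferred from the RHD table in this thread
# (5 + 87 DE TFs, 6 + 803 non-DE TFs). Use your own column sums instead.
TOTAL_DE = 92
TOTAL_NON_DE = 809


def family_table(non_de, de):
    """Return (A, B, C, D) for one family, in the layout shown above."""
    a = de                      # in group, DE
    b = non_de                  # in group, not DE
    c = TOTAL_DE - de           # not in group, DE
    d = TOTAL_NON_DE - non_de   # not in group, not DE
    return a, b, c, d


def enrichment_p(a, b, c, d):
    """One-sided Fisher p-value: P(top-left cell >= a) with fixed margins."""
    n_group = a + b
    n_de = a + c
    n = a + b + c + d
    upper = min(n_group, n_de)
    tail = sum(comb(n_de, k) * comb(n - n_de, n_group - k)
               for k in range(a, upper + 1))
    return tail / comb(n, n_group)


tables = {fam: family_table(*pair) for fam, pair in counts.items()}
p_values = {fam: enrichment_p(*tab) for fam, tab in tables.items()}

for fam, tab in tables.items():
    print(fam, tab, f"p = {p_values[fam]:.4f}")
```

A library routine such as `scipy.stats.fisher_exact` can replace `enrichment_p`. Because one test is run per family, the p-values are usually adjusted for multiple testing before calling any family enriched.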


Thanks @Asaf,

Actually, I wanted to thank you in a comment, but from China (where I am now) it is hard even to open the Biostars home page, and it is impossible to comment on other people's responses. So I am thanking you here through "add answer" and would also like to ask you a question.

I used the contingency table as you recommended:

(e.g. for TF family RHD)

              DE  not DE
in group      5   6
not in group  87  803

The Fisher's exact test result for this table is:

Left p-value    0.999706420
Right p-value   0.002804590
2-Tail p-value  0.002804590

From the result we can see that the right p-value and the 2-tail p-value are both smaller than 0.05. Does that mean that, among the TF transcripts in the transcriptome data, the RHD TF family is over-represented among the DE TFs in the treated group?

Which p-value should I use?

Could you explain something about this?

Thanks


Yes, you can say that there are more DE TFs in the RHD family than expected at random. Read a bit about the test and you'll understand what the right and left p-values are (briefly: the right p-value tests whether the top-left cell is larger than expected at random, the left p-value whether it is smaller).
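To make the three p-values concrete, here is a minimal sketch that computes them for the RHD table from the hypergeometric distribution, using only the Python standard library. The function name `fisher_exact_2x2` is made up for illustration, and the two-tail rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention, assumed to be what the calculator above used.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (left, right, two_tail) p-values for the table [[a, b], [c, d]]."""
    row1 = a + b          # TFs in the family
    col1 = a + c          # DE TFs
    n = a + b + c + d     # all TFs
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    denom = comb(n, row1)

    # Probability of every possible top-left cell value with margins fixed.
    pmf = {k: comb(col1, k) * comb(n - col1, row1 - k) / denom
           for k in range(lo, hi + 1)}

    left = sum(p for k, p in pmf.items() if k <= a)    # top-left <= observed
    right = sum(p for k, p in pmf.items() if k >= a)   # top-left >= observed
    # Two-tail: all outcomes at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    cutoff = pmf[a] * (1 + 1e-9)
    two_tail = sum(p for p in pmf.values() if p <= cutoff)
    return left, right, two_tail


# RHD table from the post: 5 DE / 6 not DE in group, 87 / 803 outside it.
left, right, two_tail = fisher_exact_2x2(5, 6, 87, 803)
print(f"left = {left:.6f}, right = {right:.6f}, two-tail = {two_tail:.6f}")
```

For the enrichment question asked here, the right p-value is the relevant one. In this table the two-tail value coincides with it because no outcome in the left tail is as improbable as the observed count of 5.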
