Obtaining a list of specific TFs from an InterPro TSV file
6.2 years ago
a.rex ▴ 350

I recently ran InterProScan on predicted ORFs >100 aa. I then used the PFAM_DBD and SUPERFAMILY_DBD database IDs in the hope of collecting TFs. Of course, many genes have both a Homeobox hit (PF00046) and another hit such as PAX (PF00292).

My question is, how do people make a prediction for the number of TFs?

I simply took all the TF hits in the list and removed duplicates. Would this be valid for identifying the total number of TFs?

But how can I account for specific families?
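To make the "one gene, multiple hits" issue concrete, here is a minimal sketch of one way to count TFs from the TSV. It assumes the standard InterProScan TSV layout (protein accession in column 1, signature accession in column 5), and the `DBD_ACCESSIONS` table is only an illustrative two-entry subset, not the real DBD list. The function names are hypothetical. Each protein is counted once, and proteins are then grouped by their combination of DNA-binding domains, so a PAX + Homeobox gene is not lumped in with Homeobox-only genes.

```python
import csv
from collections import Counter, defaultdict

# Illustrative subset only (assumption): a real run would load the full
# list of DNA-binding domain accessions instead of these two entries.
DBD_ACCESSIONS = {"PF00046": "Homeobox", "PF00292": "PAX"}


def tf_domains_per_protein(tsv_lines, dbd=DBD_ACCESSIONS):
    """Map each protein ID to the set of DBD accessions it hits."""
    hits = defaultdict(set)
    reader = csv.reader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 5:
            continue  # skip blank or malformed lines
        protein_id, signature_acc = row[0], row[4]
        if signature_acc in dbd:
            hits[protein_id].add(signature_acc)
    return dict(hits)


def count_by_architecture(hits):
    """Count proteins per combination of DBD accessions."""
    return Counter("+".join(sorted(accs)) for accs in hits.values())


# Made-up example rows in InterProScan TSV column order.
example = [
    "orf1\tmd5\t300\tPfam\tPF00046\tHomeobox\t10\t60\t1e-20\tT\t01-01-2020",
    "orf1\tmd5\t300\tPfam\tPF00292\tPAX\t100\t220\t1e-30\tT\t01-01-2020",
    "orf2\tmd5\t250\tPfam\tPF00046\tHomeobox\t5\t58\t1e-18\tT\t01-01-2020",
    "orf3\tmd5\t400\tPfam\tPF00069\tPkinase\t20\t280\t1e-50\tT\t01-01-2020",
]

hits = tf_domains_per_protein(example)
architectures = count_by_architecture(hits)
print("putative TF genes:", len(hits))
for arch, n in sorted(architectures.items()):
    print(arch, n)
```

With a real file, the same functions can be fed an open file handle instead of the `example` list. Whether a mixed architecture such as PF00046+PF00292 should be assigned to the PAX family or kept as its own class is a curation decision that this sketch leaves open.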

Interpro sequence • 1.2k views

What kind of TF count are you looking for: how many different types of TFs there are in the genome, or how many genes are potentially TFs?


How many different types of TFs. Thanks


The more difficult one, then ;)

That sounds like a reasonable approach. How do you deal with a case like the one you described (one gene, multiple hits)? And what exactly do you mean by "how can I account for specific families"?

Perhaps you would be better off, in the end, first creating gene families and then annotating them family-wise, based on the genes in each family.
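A rough sketch of what family-wise annotation could look like, assuming you already have gene families from some clustering step and a per-gene set of DNA-binding domain accessions. The names `annotate_families`, `families`, `gene_domains` and the `min_fraction` threshold are all hypothetical choices for illustration, not an established method: a family is labelled with every domain carried by at least that fraction of its members.

```python
from collections import Counter


def annotate_families(families, gene_domains, min_fraction=0.5):
    """Label each gene family with the DBD accessions that occur in at
    least `min_fraction` of its member genes (threshold is an assumption)."""
    annotations = {}
    for family_id, members in families.items():
        if not members:
            continue  # nothing to annotate in an empty family
        counts = Counter()
        for gene in members:
            # Genes without any DBD hit simply contribute nothing.
            counts.update(gene_domains.get(gene, ()))
        keep = sorted(
            acc for acc, n in counts.items() if n / len(members) >= min_fraction
        )
        if keep:
            annotations[family_id] = keep
    return annotations


# Made-up input: family membership and per-gene DBD hits.
families = {"fam1": ["g1", "g2", "g3"], "fam2": ["g4", "g5"]}
gene_domains = {
    "g1": {"PF00046"},
    "g2": {"PF00046", "PF00292"},
    "g3": {"PF00046"},
}

annotations = annotate_families(families, gene_domains)
print(annotations)
```

Here fam1 is labelled by the Homeobox accession because all three members carry it, while the single PAX hit falls below the threshold. Lowering `min_fraction` would keep such minority domains, at the cost of more noise.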

Looking into the literature might help as well. There are quite a few TF database resources, and the papers describing them might give you some ideas. Examples: PlantTFDB, PlnTFDB (check the citation section at the bottom of the page).
