I fear this question may be terribly basic. I've been asked for a list of transcription factors that are differentially expressed in an experiment. Finding differentially expressed genes is fine, but deciding which are TFs is proving a tad elusive.
I generated my list by looking up the GO biological process "regulation of transcription" for each gene using biomart. If this phrase appeared anywhere in a gene's annotation I kept the gene, and if it didn't I filtered it out.
I got back an email saying "no way are some of these things TFs", which was frustrating. So now I'm looking for the GO molecular function "transcription factor activity" but suddenly have two questions:
- Is this any more likely to correspond to what a biologist is looking for as a TF?
- If a gene has a term that is more specific than "transcription factor activity", is there any way to see if its parent term is "transcription factor activity"?
If these questions, which I know are really basic, can be answered by a handy function in R like `is.TF(genesymbol)`, that would be awesome.
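To make the second bullet concrete: checking whether a more specific term sits under "transcription factor activity" amounts to walking up the ontology's parent links. Here is a minimal sketch in Python, assuming a tiny hand-written graph rather than the real GO. The term labels, the gene names and the `is_tf` helper are all made up for illustration; the same logic should carry over to R.

```python
from collections import deque

# Hypothetical miniature "is_a" graph: each term maps to its direct parents.
# These labels are illustrative stand-ins, not real GO accessions.
PARENTS = {
    "molecular_function": set(),
    "binding": {"molecular_function"},
    "kinase": {"molecular_function"},
    "tf_activity": {"molecular_function"},
    "child_tf": {"tf_activity"},
    "grandchild_tf": {"child_tf", "binding"},  # a term may have several parents
}

TF_TERM = "tf_activity"


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    queue = deque(parents.get(term, ()))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, ()))
    return seen


def is_tf(gene_terms, tf_term=TF_TERM, parents=PARENTS):
    """True if any annotated term is the TF term itself or one of its descendants."""
    return any(
        term == tf_term or tf_term in ancestors(term, parents)
        for term in gene_terms
    )


# Hypothetical gene-to-term annotations, for illustration only.
ANNOTATIONS = {
    "geneA": {"grandchild_tf"},
    "geneB": {"kinase", "binding"},
    "geneC": {"tf_activity"},
}

tf_genes = sorted(g for g, terms in ANNOTATIONS.items() if is_tf(terms))
```

With real data, `PARENTS` would be built from the ontology's own parent relationships and `ANNOTATIONS` from the biomart output, so a gene annotated only with a more specific child term would still be picked up.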
Mike: You may remove the beginner tag; IMHO it is not a beginner-level question. It is a real use case for an integrated bioinformatics data-mining approach.
I would change the title of this question to "Determine Whether A Gene is A Transcription Factor". Do you agree?
@giovanni: you are a moderator, so IMHO you can go ahead and make questions clearer. See the "Other people can edit my stuff?!" section of the SO FAQ: http://stackoverflow.com/faq
Hi Michael, thank you, but I was just asking for confirmation, since I wasn't sure I had understood the question correctly.
How did you produce your list? I've just looked up Stat1 in EnsEMBL, in biomart, at MGI and on the GO website. All of them have it annotated as a transcription factor.
@giovanni - thanks for the edit! Always good to have the question cleaned up!
@Keith - you're right. I think I'm going to remove the list as it's now distracting rather than clarifying