I'm interested in how gene expression changes in T cells during the immune response. One way to distinguish T cells with different functions is by which surface proteins they express at different times along the way. The trouble is that these surface proteins are such excellent discriminators that we use a small number of them to sort the cells into different phenotypes. Hence, when I come along and apply my fancy differential gene expression tool to find out what's changing during the immune response, all I find is a whole ton of other surface receptors that change with the phenotype.
This has a couple of problems. The first is that it's kind of boring: all I've discovered is that lots of surface receptors change during the immune response, which I already knew. The second is that my bio colleagues are less interested in surface proteins, as they're not very well conserved between organisms.
So I was thinking of removing all of the genes that belong to a couple of KEGG pathways (cytokine-cytokine receptor interaction and T cell receptor signalling) before I start my analysis. Two questions:
1) Is this a bad idea for some reason that I haven't realised? Are there more appropriate ways of filtering lists of genes before analysing expression?
2) Assuming it's an awesome, well-motivated-by-the-biology-and-the-data idea, are there any tools, preferably in Bioconductor, that can give me a simple yes/no answer to the question: is this gene in either of these pathways?
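To make the idea concrete, here is a minimal sketch of the filtering step I have in mind. It's in Python purely for illustration (not Bioconductor), and everything in it is an assumption: the gene identifiers are made-up placeholders, the pathway membership sets are not real KEGG content, and the function names `in_any_pathway` and `filter_genes` are hypothetical. In practice the sets would be filled from whatever KEGG lookup turns out to work.

```python
# Hypothetical pathway membership sets. The gene IDs are placeholders,
# NOT real KEGG annotations; real sets would come from a KEGG lookup.
PATHWAY_GENES = {
    "cytokine_cytokine_receptor_interaction": {"geneA", "geneB", "geneC"},
    "t_cell_receptor_signalling": {"geneC", "geneD"},
}


def in_any_pathway(gene, pathways=PATHWAY_GENES):
    """Yes/no answer: is this gene a member of at least one pathway?"""
    return any(gene in members for members in pathways.values())


def filter_genes(genes, pathways=PATHWAY_GENES):
    """Keep only the genes that are in none of the pathways, preserving order."""
    return [g for g in genes if not in_any_pathway(g, pathways)]


# Placeholder list standing in for the genes on the expression platform.
expressed = ["geneA", "geneX", "geneD", "geneY"]

# Genes left over for the differential expression analysis.
kept = filter_genes(expressed)
```

The differential expression analysis would then run on `kept` only. Keeping the yes/no test as its own function means the same check could also be used to flag genes after the analysis instead of dropping them beforehand.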
Oh, I think question 2 has been answered already. If so, I'll just check that it works and remove this bit from my original question. Sorry for missing it before asking!
I gave you some suggestions for question #2