I would like to test a number of different genes for whether a mutation in that gene is significantly associated with a particular group of samples. For each gene I will perform a Fisher test comparing the number of samples in group A with any mutation in that gene versus the number of samples in group B with a mutation in the same gene, repeated for X genes. Each group consists of 11 samples.

However, I note that some of the genes are mutated in very few samples in total, say 1 or 2. In those cases I could never get a significant p-value, regardless of how the instances of the mutation were distributed across the samples. Is it then a good idea to discard these genes from the test in order to reduce the burden of the false discovery rate correction I will need to perform? Or could that be considered "fishing" for significance?

What would be a sensible cutoff for the number of mutated instances to demand in that case? Using an online Fisher test, I note that a significant p-value is only attainable when at least 5 mutated instances are present (in the most optimistic scenario of all mutations belonging to one group). Would it then be wise to use a minimum of 5 mutated samples as a criterion for considering a gene for testing? (I'm asking because it is very easy to find excuses when something looks borderline significant after FDR correction...)
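To sanity-check that cutoff, here is a minimal sketch of the best-case calculation (assuming a two-sided test and the 11-vs-11 group sizes above):

```python
from scipy.stats import fisher_exact

n_a = n_b = 11  # group sizes from the question

for k in range(1, 7):
    # Best case: all k mutated samples fall in group A.
    table = [[k, 0], [n_a - k, n_b]]
    _, p = fisher_exact(table, alternative="two-sided")
    print(f"{k} mutated samples, best-case p = {p:.4f}")
```

With these group sizes, 4 mutated samples give a best-case two-sided p of about 0.09, while 5 give about 0.035, which matches the cutoff of 5 I observed.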
The statistical principle you are looking for is "independent filtering". If you google "independent filtering gwas" you will get some ideas.

The key element is that your filtering should be performed on a metric that is independent of the test statistic. In your case, the total number of mutated samples across both groups is such a metric: the Fisher test conditions on that total and only measures how the mutations are split between the groups, so filtering on the total does not bias the p-values of the genes that pass the filter.
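As an illustration (with hypothetical gene counts, and assuming scipy and statsmodels are available), filtering on the total mutation count before testing and applying a Benjamini-Hochberg correction might look like this:

```python
from scipy.stats import fisher_exact
from statsmodels.stats.multitest import multipletests

# Hypothetical input: per gene, the number of mutated samples in each group.
n_a = n_b = 11  # group sizes from the question
genes = {"gene1": (6, 1), "gene2": (1, 1), "gene3": (5, 0), "gene4": (2, 2)}

# Independent filtering: keep genes by TOTAL mutation count across both
# groups, a quantity the conditional Fisher test does not depend on.
min_total = 5  # smallest total that can reach p < 0.05 in the best case
kept = {g: (a, b) for g, (a, b) in genes.items() if a + b >= min_total}

pvals = []
for gene, (a, b) in kept.items():
    # 2x2 table: mutated / not mutated by group A / group B
    table = [[a, b], [n_a - a, n_b - b]]
    _, p = fisher_exact(table)
    pvals.append(p)

# FDR correction over the reduced set of tests only
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
for gene, p, q, sig in zip(kept, pvals, p_adj, reject):
    print(f"{gene}: p = {p:.4f}, FDR-adjusted = {q:.4f}, significant: {sig}")
```

The point of the filter is that fewer hypotheses enter the correction, so the genes that could never reach significance no longer dilute the ones that can.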