2.4 years ago
coboyfan12
I assume there is a simple statistical test for this, but I can't find anything when searching.
For example:
My dataset has SNP variants mapped to genes for n=25 "cancer" and n=25 "normal" patients. If a gene has a variant, the gene symbol is listed once per variant. So if patient five had 7 mutations in ATF4, then ATF4 would appear 7 times in the dataset for that patient.
I'm wondering if there is a statistical test or package in R to determine which genes are more "enriched" in each group.
What about genes that have no variant? You could look into Fisher's exact test, but you'll need a way to define the universe of genes, i.e. all genes that were interrogated.
I could obtain the list of all genes interrogated. But I'm not sure how, with Fisher's exact test, I would identify the specific genes that are enriched in each group?
well, you'd have
Based on your description, you should be able to retrieve the number of genes for each of those four categories.
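To make the four-cell idea concrete, here is a minimal sketch of a two-sided Fisher's exact test on a single 2x2 table, using only the Python standard library. The four categories are not spelled out in the thread, so the layout below is an assumption: for one gene, patients are split into cancer with a variant, cancer without, normal with, and normal without. The function name and the example counts are hypothetical. In R, the built-in `fisher.test` applied to a 2x2 matrix runs the same kind of test.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (not stated in the thread), for one gene:
        a = cancer patients with a variant     b = cancer patients without
        c = normal patients with a variant     d = normal patients without
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x variant carriers in the first row,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is at most as likely as the
    # observed one; the small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(p_value, 1.0)


# Hypothetical counts: a gene mutated in 10 of 25 cancer and 2 of 25 normal patients.
p = fisher_exact_two_sided(10, 15, 2, 23)
print(f"p-value: {p:.4g}")
```

Running this once per gene gives one p-value per gene; the direction of enrichment can be read from which group has the higher carrier fraction. With many genes tested, the p-values would normally need a multiple-testing adjustment.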
One pesky detail may be whether to count unique instances of gene names (e.g. your 7 mutations of ATF4 would give only a count of +1 since it's all for the same gene) or not.
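The two counting conventions can be compared side by side with a short sketch. The record layout, a list of (patient, gene) pairs with one entry per variant, is an assumption based on the description in the question, and the patient IDs, the second gene and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical input: one (patient_id, gene_symbol) pair per variant, so a
# patient with 7 variants in ATF4 contributes 7 identical pairs.
records = [("P5", "ATF4")] * 7 + [("P6", "ATF4"), ("P6", "TP53")]


def count_variants(records):
    """Count every variant: 7 mutations in one patient add 7 to the gene."""
    return Counter(gene for _, gene in records)


def count_carriers(records):
    """Count each gene once per patient, however many variants it has."""
    unique_pairs = set(records)  # collapses repeated (patient, gene) pairs
    return Counter(gene for _, gene in unique_pairs)


print(count_variants(records)["ATF4"])  # 8
print(count_carriers(records)["ATF4"])  # 2
```

Carrier counts (one per patient per gene) are the ones that fit a patient-level 2x2 table, since each patient then falls into exactly one cell.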