I agree with Devon's reply, but I also wonder if you are looking for a gene burden test. That is, are there more rare variants in gene X (possibly damaging or not) in my cases than there are in my controls?
Take haemophilia A for example: a group of cases would have more rare variants in the Coagulation Factor VIII gene than controls would (it's a monogenic disease and almost all cases are explained by mutations in FVIII). That doesn't tell you which of the variants are pathogenic, but collectively, rare mutations in the FVIII gene are more numerous in cases than in controls.
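To make the idea concrete, here is a minimal sketch of the simplest kind of burden test: count how many cases and how many controls carry at least one rare variant in the gene, then ask whether carriers are enriched among cases with a one-sided Fisher's exact test. The function name and the counts are made up for illustration, and real tools offer more refined tests than this; treat it as a sketch of the logic only.

```python
from math import comb

def burden_fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Returns P(X >= case_carriers), where X is the number of carriers that
    fall among the cases if carrier status is unrelated to disease status
    (hypergeometric null). Hypothetical helper, not from any library.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denom = comb(total, carriers)
    upper = min(carriers, n_cases)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom

# Made-up counts: 12 of 50 cases and 5 of 500 controls carry a rare variant in gene X.
p_value = burden_fisher_one_sided(12, 50, 5, 500)
print(f"one-sided p-value for gene X: {p_value:.3g}")
```

Note that the test only says the gene as a whole is enriched for rare variants in cases; it says nothing about which individual variant is causal.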
Since you have a variant file (presumably VCF) of cases and controls, you should have a look at PLINK/SEQ.
They have an awesome tutorial that works through some example data, and you should work towards the burden tests (under association tests).
Having said that, and with a nod to Devon's caution, you need to be sure that the cases and controls are ethnically matched, that the variants have been called within the same exome (exomic?) regions and sequenced using the same method, and that you have sufficient power to detect a significant enrichment, and boldly go where no man has gone before...
Thanks Devon for the explanation and the suggestion. I also want to ask one additional question. If a gene has 7 SNPs in both the case dataset and the control dataset, but the number of individual is different, then how should I do the association study? I think I can't do it SNP by SNP.
I am sorry for asking such a basic question, but I am very new to human genetics.
By "the number of individual is different", do you mean between cases and controls or by SNP even within the cases and controls? Having different numbers of cases and controls is extremely common and not an issue at all (usually you have a LOT more controls than cases, since controls are easy to come by). Depending on the exact nature of the disease and data, you can sometimes test things as a group, as we did here (see supplemental table S4). You could also test SNPs individually, but that often makes more sense when you're looking at complex diseases and I assume you're working on a more Mendelian disorder.
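As a rough illustration of testing SNPs "as a group" with unequal group sizes, the sketch below collapses the 7 SNPs in a gene into a single carrier flag per individual and compares carrier fractions rather than raw counts. The genotypes and function names are invented for the example; each inner list holds one individual's alternate-allele counts (0, 1 or 2) at the 7 SNPs.

```python
def count_carriers(genotypes):
    """Number of individuals with at least one alternate allele at any SNP in the gene."""
    return sum(1 for person in genotypes if any(g > 0 for g in person))

def carrier_fraction(genotypes):
    """Carriers divided by group size, so groups of different sizes are comparable."""
    return count_carriers(genotypes) / len(genotypes)

# Made-up data: 3 cases and 6 controls, 7 SNPs each.
cases = [
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0],
]
controls = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print("carrier fraction in cases:   ", carrier_fraction(cases))
print("carrier fraction in controls:", carrier_fraction(controls))
```

The carrier counts and group sizes from a collapse like this are what you would then feed into a contingency-table test; the group sizes enter the test directly, which is why they do not need to be equal.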
Hi Devon,
Thanks for sharing the paper. Yes, I am working on a Mendelian disorder and will try to follow your advice.