5.8 years ago
efsvdo
Hi, this is a new area, far from my expertise. I want to check whether a specific gene has SNPs, but I don't have the MAF or any other information to set the cohort size, so I am not sure where to start. How do I calculate it?
You can search dbSNP or Ensembl to see whether the gene has SNPs.
Thanks for the answer. Yes, I know the gene has around 5000 SNPs, but they are not pathogenic, and I would like to associate them with a specific condition afterwards. I was wondering whether I could sequence the gene in my problem cohort and see what I find, but I am not sure what the sample size should be, and there is no information at all about the relationship between this gene and the condition. Thanks again.
Nothing on incidence rate (of the disease)? You need to perform a power analysis in order to ascertain an adequate sample size. Take a look here: A: Power Analysis for SNPs QTL GWAS
Thanks. Yes, the incidence of the disease is 24%, but I don't have any information about the SNPs. I want to sequence the gene in this population. I will probably find the SNPs already reported plus new ones, and I want to see whether there is some association with the disease. Most of the calculators ask me for a number I don't have, for example the MAF, the Disease Allele Frequency ratio, or the Genotype Relative Risk...
24% is quite high (?). For a disease with that incidence, there must surely be a lot of research already conducted, or is it 24% in a specific population group? Using the tools that I listed in the other thread, you will be able to determine a suitable sample size. If allele frequencies are unknown, then determine sample sizes for different levels of statistical power and for different allele frequencies.
For example, "With X controls and Y patients, we will be able to detect a disease association signal at 5% alpha and 80% power assuming an allele frequency of 5%"
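A statement like this can be backed by a standard normal-approximation sample-size formula for comparing two proportions. This is only a sketch, and not necessarily the model used by the tools in the other thread; the symbols $p_1$ (allele frequency in patients), $p_2$ (allele frequency in controls) and $n$ (sampled units per group) are notation introduced here for illustration.

```latex
\[
n \;=\; \frac{\left( z_{1-\alpha/2}\,\sqrt{2\,\bar{p}\,(1-\bar{p})}
        \;+\; z_{1-\beta}\,\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right)^{2}}
        {(p_1 - p_2)^{2}},
\qquad
\bar{p} = \frac{p_1 + p_2}{2}
\]
```

For a two-sided alpha of 5%, $z_{1-\alpha/2} \approx 1.96$; for 80% power, $z_{1-\beta} \approx 0.84$. If the sampled units are alleles rather than individuals, each diploid individual contributes two of them.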
There is a lot of info about the disease (infertility) and a lot about the gene, but not about the two together. Would you consider a MAF of 5% for the study population and 1% for the controls a good start?
I would just test varying combinations of allele frequencies for the population and controls, and come up with multiple different sample size estimates. That way, the reviewer, etc., cannot be too critical.
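Testing varying combinations could look roughly like the sketch below. It assumes a simple normal-approximation formula for comparing two proportions with equal group sizes, which may differ from what the dedicated power tools compute. The function names `n_per_group` and `sample_size_grid`, and the frequency and power values in the grid, are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_case, p_control, alpha=0.05, power=0.80):
    """Approximate units needed per group to detect p_case vs p_control.

    Two-sided test of two proportions, equal group sizes (an assumption).
    """
    if p_case == p_control:
        raise ValueError("frequencies must differ to compute a sample size")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile for the requested power
    p_bar = (p_case + p_control) / 2    # pooled frequency under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_case * (1 - p_case) + p_control * (1 - p_control))
    ) ** 2
    return ceil(numerator / (p_case - p_control) ** 2)


def sample_size_grid(case_freqs, control_freqs, powers, alpha=0.05):
    """Return (power, case freq, control freq, n per group) rows.

    Only combinations where the case frequency exceeds the control
    frequency are kept (illustrative choice).
    """
    rows = []
    for power in powers:
        for p_control in control_freqs:
            for p_case in case_freqs:
                if p_case > p_control:
                    n = n_per_group(p_case, p_control, alpha, power)
                    rows.append((power, p_case, p_control, n))
    return rows


if __name__ == "__main__":
    # Hypothetical grid of allele frequencies and power levels.
    grid = sample_size_grid(
        case_freqs=[0.05, 0.10, 0.20],
        control_freqs=[0.01, 0.05, 0.10],
        powers=[0.80, 0.90],
    )
    print("power  case  control  n_per_group")
    for power, p_case, p_control, n in grid:
        print(f"{power:5.2f}  {p_case:4.2f}  {p_control:7.2f}  {n:11d}")
```

The resulting table can be reported as a range of scenarios rather than a single number. If the frequencies are allele frequencies and the test is allele-based, the counts are alleles, so the number of diploid individuals is about half.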
OK, thanks. With these percentages the sample size is acceptable, even low: around 300 cases per group. Is there a standard, basic, or more frequently used set of starting numbers, so that I can include those and the ones in between? Thanks again.
I have seen 300 for a few studies. The larger studies have 1000s of samples, of course. There is no right or wrong answer, but obviously you cannot really do much with 10 or 20 samples.
Regarding the MAF: is there a standard, basic, or more frequently used set of starting values, so that I can include those and the ones in between?