SNPs and calculation of sample size
5.7 years ago
efsvdo • 0

Hi, this is a new area far from my expertise. I want to check whether a specific gene has SNPs, but I don't have the MAF or any other information to set the cohort size, so I am not sure where to start. How do I calculate it?

snps • 2.9k views

You can search dbSNP or Ensembl to see whether the gene has SNPs.


Thanks for the answer. Yes, I know the gene has about 5000 SNPs, but they are not pathogenic, and I'd like to test them for association with a specific condition afterwards. I was wondering if I can sequence the gene in my problem cohort and see what I find, but I am not sure what the sample size should be, and there is no information at all about this gene's relationship with the condition. Thanks again.


Nothing on the incidence rate (of the disease)? You need to perform a power analysis in order to ascertain an adequate sample size. Take a look here: A: Power Analysis for SNPs QTL GWAS


Thanks. Yes, the incidence of the disease is 24%, but I don't have any information about the SNPs. I want to sequence the gene in this population; I will probably find the SNPs already reported plus new ones, and I want to see if there is some association with the disease. Most of the calculations ask me for a number I don't have, for example the MAF, the disease allele frequency ratio, or the genotype relative risk...
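When a calculator asks for a case allele frequency or an effect size that is not known, one way to get started is to assume a control MAF and an allelic odds ratio and derive the case allele frequency from them. A minimal sketch in Python, assuming a simple allelic model; the function names and the example values are hypothetical and are not taken from any tool mentioned in this thread:

```python
def case_allele_freq(control_maf: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by a control MAF and an allelic odds ratio."""
    odds_controls = control_maf / (1.0 - control_maf)
    odds_cases = odds_ratio * odds_controls
    return odds_cases / (1.0 + odds_cases)


def allelic_odds_ratio(case_maf: float, control_maf: float) -> float:
    """Allelic odds ratio implied by a pair of allele frequencies (inverse of the above)."""
    return (case_maf / (1.0 - case_maf)) / (control_maf / (1.0 - control_maf))


# Illustrative assumptions only: these are not estimates for any real gene.
for maf in (0.01, 0.05, 0.10, 0.20):
    for assumed_or in (1.5, 2.0, 3.0):
        p_cases = case_allele_freq(maf, assumed_or)
        print(f"control MAF={maf:.2f}  OR={assumed_or:.1f}  case allele freq={p_cases:.3f}")
```

The resulting pairs of frequencies can then be entered into whichever power calculator you use.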


24% is quite high (?). For a disease with that incidence, there must surely be a lot of research already conducted, or is it 24% in a specific population group? Using the tools that I listed in the other thread, you will be able to determine a suitable sample size. If allele frequencies are unknown, then determine sample sizes for different levels of statistical power and for different allele frequencies.

For example, "With X controls and Y patients, we will be able to detect a disease association signal at 5% alpha and 80% power assuming an allele frequency of 5%"
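A statement like that can be backed by a short calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions, applied to allele counts. It assumes equal numbers of cases and controls, a two-sided test of a single SNP with no multiple-testing correction, and two independent alleles per person; the function names are hypothetical, and dedicated power tools may give somewhat different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def alleles_per_group(p_cases, p_controls, alpha=0.05, power=0.80):
    """Alleles needed per group to detect p_cases vs p_controls (normal approximation)."""
    if p_cases == p_controls:
        raise ValueError("allele frequencies must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_cases + p_controls) / 2  # pooled frequency under the null
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p_cases * (1 - p_cases) + p_controls * (1 - p_controls))
    return ((z_alpha * null_sd + z_power * alt_sd) / (p_cases - p_controls)) ** 2


def individuals_per_group(p_cases, p_controls, alpha=0.05, power=0.80):
    """Each diploid individual contributes two alleles (assumed independent)."""
    return ceil(alleles_per_group(p_cases, p_controls, alpha, power) / 2)


# Assumed frequencies, for illustration only.
n = individuals_per_group(p_cases=0.05, p_controls=0.01)
print(f"{n} cases and {n} controls for 80% power at 5% alpha")
```

Note that the formula counts alleles. If the frequencies are instead read as the proportion of people carrying the variant, the result of `alleles_per_group` would be the number of individuals per group.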


There is a lot of information about the disease (infertility) and a lot about the gene, but not about the two together. Would you consider a MAF of 5% for the study population and 1% for the controls a good start?


I would just test varying combinations of allele frequencies for the population and controls, and come up with multiple different sample size estimates. That way, the reviewer, etc., cannot be too critical.
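Something like the following would produce such a table. It is only a sketch: it computes approximate power for an allelic test from the normal approximation for two proportions, assuming equal group sizes, a two-sided 5% alpha for a single SNP, and two independent alleles per person. The grid values and the function name are placeholders, not recommendations.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(p_cases, p_controls, n_per_group, alpha=0.05):
    """Approximate power with n_per_group individuals in each group."""
    n_alleles = 2 * n_per_group  # two alleles per diploid individual
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p_cases + p_controls) / 2
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p_cases * (1 - p_cases) + p_controls * (1 - p_controls))
    effect = abs(p_cases - p_controls) * sqrt(n_alleles)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf((effect - z_alpha * null_sd) / alt_sd)


# Placeholder grids; replace them with values plausible for your own study.
CONTROL_MAFS = (0.01, 0.05, 0.10)
CASE_MAFS = (0.05, 0.10, 0.15, 0.20)

for n in (100, 300, 1000):
    for p_controls in CONTROL_MAFS:
        for p_cases in CASE_MAFS:
            if p_cases > p_controls:
                pw = allelic_power(p_cases, p_controls, n)
                print(f"n={n:5d}  controls={p_controls:.2f}  cases={p_cases:.2f}  power={pw:.2f}")
```

Reporting the whole table, rather than a single number, shows which combinations of frequencies a given cohort size can and cannot detect.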


OK, thanks. With these percentages the sample size is acceptable, even low: around 300 cases per group. Is there a standard, basic, or more commonly used set of starting numbers, so that I can include those in between? Thanks again.


I have seen 300 for a few studies. The larger studies have 1000s of samples, of course. There is no right or wrong answer, but obviously you cannot really do much with 10 or 20 samples.


Regarding the MAF: is there a standard, basic, or more commonly used set of starting values, so that I can include those in between?

5.7 years ago

You can try BLASTN at Ensembl to retrieve related sequences and look for variants; perhaps you could observe natural variants or EMS-induced variants from the Ensembl server. If you have a VCF file, you can assess the variants with SIFT scores. Try this: A: Allele frequency visualization


Thanks! I will see if I can follow, since I don't work in the field.
