8.1 years ago
marwa.cs91
Hello all,
I used a dataset of patients and normal individuals. After quality control, the dataset contains 1560242 SNPs. I applied the genome-wide significance threshold (p < 5e-8), but very few SNPs pass it. Is there another appropriate threshold I could use to get more SNPs?
Thanks
Hint: correct for multiple testing.
Thanks for your important hint. I used the plink --adjust command to get FDR-adjusted significance values. How do I know which significance threshold to use for the FDR? Thanks
It's the same arbitrary threshold as with p-values.
Can I use the threshold FDR < 0.05? Is that correct?
Thanks
The FDR represents the expected proportion of false positives among the SNPs you call significant, so if you're happy with 5% false positives among your hits then 0.05 is a good threshold. It depends on what you want to do.
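To make the FDR cut-off concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment followed by an FDR < 0.05 filter. It is only an illustration and not plink's own implementation; the function name `bh_adjust` and the five toy p-values are made up for this example.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values (hypothetical, not from any real dataset).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
q = bh_adjust(toy_p)

# Keep the tests whose adjusted p-value is below the chosen FDR cut-off.
fdr_cutoff = 0.05
significant = [i for i, value in enumerate(q) if value < fdr_cutoff]
print(q)
print(significant)
```

With a real analysis you would apply the same filter to the adjusted column that your tool reports rather than recomputing it by hand.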
Thanks Jean-Karim for your description. After quality control my dataset contains 1560242 SNPs, but with the genome-wide significance threshold (p < 5e-8) very few SNPs remain. I also tried the Bonferroni correction (0.05 / total number of SNPs), but that threshold also gave a very small number of SNPs.
Could you suggest something?
Thanks, I really appreciate any help you can provide.
The Bonferroni correction is generally considered too conservative for genomics studies, which is why the FDR is usually preferred. How you choose the threshold for what you consider significant depends on what you want to do with the SNPs afterwards. Another way of looking at it is to consider the trade-off between false positives and false negatives. If you don't care about false negatives but don't want false positives, only keep SNPs with very low FDR-corrected p-values. If, on the other hand, you want to minimize the risk of missing relevant SNPs, you have to accept more false positives by using a higher cut-off. Having few significant SNPs is not necessarily bad, e.g. when you need to pursue each one with further experiments.
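As a rough illustration of that trade-off, the sketch below computes the Bonferroni threshold for the 1560242 SNPs mentioned in the question and counts how many hits survive at several FDR cut-offs. The adjusted p-values in `toy_q` and the helper `count_hits` are invented for the example; only the SNP count comes from the thread.

```python
N_SNPS = 1560242   # number of SNPs after quality control (from the question)
ALPHA = 0.05

# Bonferroni: divide the significance level by the number of tests.
bonferroni_threshold = ALPHA / N_SNPS
print(f"Bonferroni threshold: {bonferroni_threshold:.3e}")


def count_hits(adjusted_pvalues, cutoff):
    """Count tests whose FDR-adjusted p-value is below the cut-off."""
    return sum(1 for q in adjusted_pvalues if q < cutoff)


# Hypothetical FDR-adjusted p-values, for illustration only.
toy_q = [0.0004, 0.003, 0.02, 0.04, 0.08, 0.15, 0.30, 0.70]

# A higher cut-off keeps more SNPs but tolerates more false positives.
hits_by_cutoff = {c: count_hits(toy_q, c) for c in (0.01, 0.05, 0.10, 0.20)}
for cutoff, hits in hits_by_cutoff.items():
    print(f"FDR < {cutoff}: {hits} hits")
```

Note that the Bonferroni threshold for this many SNPs is close to the conventional 5e-8, which is why both gave a similarly short list.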
Thanks Jean-Karim for your detailed description and your time.