Hello,
I am thinking of designing an algorithm to rank variants from most to least likely to have a functional effect in an autoimmune disease.
The problem is, if I wanted to train such an algorithm, it would be helpful to have a list of variants that are causative, or almost certainly causative, of an autoimmune disease.
For instance, in the Rheumatoid Arthritis literature, there are a lot of associated SNPs, but we are reasonably sure that amino acid positions 71 and 74 of HLA-DRB1 actually CAUSE the disease, rather than being an associated SNP in LD with a SNP that actually increases or decreases risk of the disease.
It can be any autoimmune disease, but Mendelian diseases and the like will not help. The goal is to mine the characteristics of these causative SNPs.
I am aware of algorithms like CADD, PICS, etc.; what I need most is a list of validated causal SNPs.
Does anyone know of such a source?
-------------ADDENDUM - someone posted the NHGRI catalog below-------------------
The problem is that the NHGRI catalog lists associations, not (necessarily) causal variants.
To determine a causal variant, we need follow-up experiments in the wet lab to show necessity and sufficiency. It is thought that the lead SNP is actually causal in 10% or fewer of cases (Farh et al. 2015).
I am looking for a resource only for the latter type of SNP.
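As a rough illustration of how such a list would be used, here is a minimal Python sketch that labels associated variants against a curated set of validated causal ones. The function name `label_variants` and the variant IDs are hypothetical placeholders, not real data or an existing tool.

```python
from typing import Dict, List, Set


def label_variants(associated: List[str], validated_causal: Set[str]) -> Dict[str, int]:
    """Label each associated variant: 1 if it is in the validated set, else 0.

    A label of 0 means "not validated", not "shown to be non-causal".
    """
    return {variant_id: int(variant_id in validated_causal) for variant_id in associated}


# Hypothetical placeholder IDs (assumption: these are not real variants).
associated = ["var_A", "var_B", "var_C", "var_D"]
validated_causal = {"var_B"}

labels = label_variants(associated, validated_causal)
```

Since the lead SNP is thought to be causal in only a minority of cases, the zeros here are best treated as unlabeled rather than as true negatives.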
See also: Database For Causal Genetic Variants
Did you find any good solutions? I am in need of the same type of resource. I have found a handful of articles where people have functionally validated specific GWAS-associated alleles to show that one is likely causal (using e.g. luciferase reporters, allele-specific binding with TF ChIP-seq, chromosome conformation capture). Most of these are not for autoimmune diseases, though. I'd be happy to share my list if you're interested.
Hello, thank you very much for the note.
I have not. There are various sources for these, but nothing approaching what could be called comprehensive. This post is also helpful: Database For Causal Genetic Variants. I would be happy to take a look at your list if you do not mind sharing it. I can be reached at this ID at gmail. From there, maybe we can start a dialogue and share notes? One caveat: I have not done any manual curation of individual papers, only queried different databases. Thanks again.
Hi, I have run into the same question as you. Have you found a good solution? Or have you finished your algorithm? It really interests me.