I am currently working with VEP, and I have SNPs from an external paper that provides only the following information: SNP, chr, pos, gene, risk allele, non-risk allele, and frequency of the risk allele.
From the risk allele frequency I can identify the major and the minor allele. Sometimes the risk allele is the major allele.
Example:
rs28411352 1 38278579 MTF1-INPP5B T C 0.26
Meaning that T is the risk allele and also the minor allele.
rs12140275 1 38633879 LOC339442 A T 0.78
Meaning that A is the risk allele, but T is the minor allele.
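The major/minor assignment in the two examples can be sketched in a few lines of Python. This is only an illustration of the reasoning above; the function name `classify_alleles` is made up for this sketch, and it assumes the last column is the frequency of the risk allele.

```python
def classify_alleles(risk, non_risk, risk_freq):
    """Return (major, minor) from the risk-allele frequency.

    Assumption: risk_freq is the frequency of the risk allele, so a value
    above 0.5 means the risk allele is the major allele. A frequency of
    exactly 0.5 is ambiguous and is treated here as "risk is minor".
    """
    if risk_freq > 0.5:
        return risk, non_risk
    return non_risk, risk


# The two rows from the paper: (snp, risk allele, non-risk allele, frequency)
rows = [
    ("rs28411352", "T", "C", 0.26),
    ("rs12140275", "A", "T", 0.78),
]

for snp, risk, non_risk, freq in rows:
    major, minor = classify_alleles(risk, non_risk, freq)
    print(snp, "major:", major, "minor:", minor)
```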
My question is: what should I give VEP as the reference allele, the non-risk allele or the major allele?
PS. I also do not have strand information, but as soon as I define what the reference allele is, I can look up in the FASTA file whether my alleles are given with respect to the positive strand or not.
Thank you in advance.
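In VEP's default input format, the strand of the alleles you give is specified in the 5th column. The first allele in the allele column (the 4th column, written as REF/ALT) should be your reference allele, i.e. the base found in the reference genome at that position, regardless of which allele is the risk or the major allele.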
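As a rough sketch of how such a line could be built, the Python below takes one SNP record plus the reference base at that position and writes a line in VEP's default layout (chromosome, start, end, REF/ALT, strand, identifier). The helper name `to_vep_line` is hypothetical, the reference base is assumed to have been looked up in the FASTA file beforehand, and the `"C"` used in the usage line is a placeholder, not the real base at that position.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_vep_line(snp_id, chrom, pos, allele_a, allele_b, ref_base):
    """Build a tab-separated VEP default-format line for a SNP.

    ref_base is the forward-strand reference base at (chrom, pos), assumed
    to come from a separate FASTA lookup. If neither allele matches it, the
    alleles are complemented on the assumption that the paper reported them
    on the reverse strand. Note: for A/T and C/G SNPs the alleles look the
    same on both strands, so a match does not prove which strand was used.
    """
    alleles = (allele_a.upper(), allele_b.upper())
    ref_base = ref_base.upper()
    if ref_base not in alleles:
        # Try the reverse-strand interpretation of the reported alleles.
        alleles = (COMPLEMENT[alleles[0]], COMPLEMENT[alleles[1]])
        if ref_base not in alleles:
            raise ValueError(f"{snp_id}: neither allele matches reference {ref_base}")
    alt = alleles[1] if alleles[0] == ref_base else alleles[0]
    # Alleles are now on the forward strand, so the strand column is "+".
    return f"{chrom}\t{pos}\t{pos}\t{ref_base}/{alt}\t+\t{snp_id}"


# Usage with a placeholder reference base ("C" is an assumption here).
print(to_vep_line("rs28411352", "1", 38278579, "T", "C", "C"))
```

The risk/non-risk and major/minor labels play no part in this step; they only matter later, when deciding which allele's consequences to interpret.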