I've performed a GWAS and got an intronic SNP as the top hit. The thing is, I have RNA-seq data, and the SNP does not seem to be affecting gene expression or splicing.
I'm wondering if anyone here knows of other mechanisms that might explain an intronic SNP showing the highest association without affecting gene expression or splicing?
updated 9.1 years ago by Prash • written 13.4 years ago by Paul
Can you give a bit more detail about your study? How many SNPs have you used in your GWAS? How good is your genotyping quality? Is this a single GWAS hit, or do nearby SNPs also tend to correlate?
It could just be that you've found a good hit on a tag-SNP.
Hey Sander, the data is the HapMap III data so it's around 1 million SNPs. I don't think that it could be LD with another SNP as I've also investigated the sequence data from the 1000 Genomes data for the samples. We also have RNA-seq data showing that gene/exon expression isn't correlated with the SNP.
The intronic SNP may be in high LD (linkage disequilibrium) with a variant that associates with translation of the mRNA. Allele 1 supports translation into functional protein while allele 2 hinders translation. In other words, it may be an association with protein expression. One version of this would be a synonymous SNP where there is a large difference in codon frequency, and hence in tRNA availability and protein translation rate.
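To make the LD idea above concrete, here is a minimal sketch of how one might screen nearby variants for LD with the top SNP. It assumes unphased 0/1/2 dosages for the same samples and uses the squared Pearson correlation of dosages as a rough stand-in for LD r². The function names, the 0.8 threshold, and the variant labels are all made up for illustration; they are not from the original analysis.

```python
from statistics import mean

def dosage_r2(snp_a, snp_b):
    """Squared Pearson correlation between two 0/1/2 dosage vectors.

    Assumption: this approximates LD r^2 for unphased genotypes.
    """
    ma, mb = mean(snp_a), mean(snp_b)
    cov = sum((a - ma) * (b - mb) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - ma) ** 2 for a in snp_a)
    var_b = sum((b - mb) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return cov * cov / (var_a * var_b)

def variants_in_ld(top_snp, neighbours, threshold=0.8):
    """Return {name: r2} for neighbouring variants at or above the threshold.

    `neighbours` maps a (hypothetical) variant name to its dosage vector.
    """
    linked = {}
    for name, dosages in neighbours.items():
        r2 = dosage_r2(top_snp, dosages)
        if r2 >= threshold:
            linked[name] = r2
    return linked
```

Any variant that comes back from `variants_in_ld`, for example a synonymous or UTR variant in the same gene, would be a candidate for the translation-level effect described above.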
It seems unlikely in this case as the gene is directly related to the phenomenon I'm investigating, although it is true that a functionally related RNA will often be transcribed within a related gene, so this is also a possibility!
+1 because this suggests an interesting example from steroid metabolism in humans. MIR33A, an intronic miRNA not known previously in steroid metabolism and located within the gene encoding SREBF2, inhibits the expression of ABCA1, thereby attenuating cholesterol efflux to APOA1 (Rayner Fernandez-Hernando 2010 Science). When cells are depleted of cholesterol, both the transcription of SREBFs and the intron-encoded miR-33 rise modestly.
Isn't this the usual situation for GWAS? I don't follow this area closely, but from what I see in publications, most of the suspect SNPs appear to be at some place where they don't make any obvious sense.
But seriously, how can you be so sure that there is no influence on transcription or splicing? Do you have one set of RNA-seq data per allele? Or many of them? I'm not an expert on RNA-seq, but with microarrays I would like to see dozens of replicates before drawing any conclusions.
I am not a GWAS expert either, but here is another thought: Having something as the 'top hit' doesn't mean that it is meaningful. I assume that you have sufficient statistical evidence for the SNP-to-phenotype association.
Finally, can you exclude that some other variant (that you did not test for) is causing the phenotype and that your intronic SNP is just linked to it?
Please excuse if these suggestions are rubbish, I am way out of my league here.
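As a rough illustration of the per-allele question above: in a sample that is heterozygous for an exonic SNP in the same gene, one could test whether RNA-seq reads are split evenly between the two alleles. This is only a sketch with a hand-rolled exact binomial test; the function name and the 50/50 null are assumptions, and it ignores reference-mapping bias and overdispersion, which real allele-specific analyses have to handle.

```python
from math import comb

def allelic_imbalance_p(ref_reads, alt_reads, p_ref=0.5):
    """Two-sided exact binomial p-value for allelic read counts.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null of balanced expression.
    Fine for modest read depth; very deep coverage would overflow the
    int-to-float conversion and need a log-space version.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[ref_reads] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))
```

For instance, `allelic_imbalance_p(80, 20)` gives a very small p-value, while `allelic_imbalance_p(50, 50)` gives essentially 1. A single heterozygous sample would not settle anything; the same direction of imbalance across many heterozygotes is what would be convincing.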
I can be very confident of the top hit: it appears in both populations assayed and survives permutation testing of phenotype labels, which is a very severe form of multiple testing correction. Good call on the allele-specific issue!
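For readers unfamiliar with it, permutation testing of phenotype labels can be sketched as follows. This is not the actual pipeline used here: the test statistic (absolute Pearson correlation between dosage and phenotype), the function names, and the defaults are assumptions chosen to keep the example short. The key point is that each permutation records the best statistic across all SNPs, which is what makes the resulting p-value genome-wide.

```python
import random
from statistics import mean

def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def max_stat_permutation_p(genotypes, phenotype, n_perm=1000, seed=0):
    """Empirical p-value for the top hit across all SNPs.

    `genotypes` is a list of per-SNP dosage vectors. Phenotype labels
    are shuffled n_perm times; each time the maximum statistic over all
    SNPs is compared with the observed maximum.
    """
    rng = random.Random(seed)
    observed = max(abs_corr(g, phenotype) for g in genotypes)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        null_max = max(abs_corr(g, shuffled) for g in genotypes)
        if null_max >= observed:
            hits += 1
    # +1 in numerator and denominator so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)
```

Note that the smallest p-value this can report is 1 / (n_perm + 1), so the number of permutations bounds how strong a claim can be made.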
I am not sure if this thread is still alive. But here are my thoughts :-)
I suggest you cross-check the allelic dataset by BLASTing your miRNA against all the available miRNA data (say, from an RNA-Seq study with publicly available datasets). I agree with Lyco that you might still tend to get lots of false positives unless you go for rigorous statistical evidence.
Even if you were to get only this intronic SNP, I suggest being more cautious and checking whether or not you observe similar intronic SNPs in other GWAS datasets. The challenge would be if the miRNA is a SNP by choice or MNP by variation.
Just my two cents. I may be wrong!
Best
Prash
updated 5.0 years ago by Ram • written 9.1 years ago by Prash