Affy Axiom array - many cases of two probe set IDs sharing the same Affy SNP ID and dbSNP RS ID values. What's going on?
9.6 years ago
devenvyas ▴ 760

Hello,

I recently got some genotyping data back from the core lab for the Axiom Human Origins array, and I have been trying to analyze it (trying being the operative word). I have been having some major frustrations.

I was trying to run a PCA in R when I noticed that a few thousand of the rows were pairs sharing the same rs ID. (In case it is relevant, the table was constructed with rs IDs as the rows and each column holding one sample's genotypes coded as 0/1/2.) I noticed the same thing within Genotyping Console (and in the data I export from it). There are many cases where there are two Probe Set IDs for the same Affy SNP ID and dbSNP RS ID values...

Does anyone know why this is? Is this indicative of the exact same site being genotyped twice? How do I filter these cases out, so that when I export genotypes I am not getting duplicates of the same locus? Thanks!
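One way to see how the paired rows relate is to list the duplicated rs IDs and check how often the two rows agree across samples. Below is a minimal Python sketch of that check, assuming the table is held as (rs ID, genotype list) pairs with missing calls stored as None. The toy genotypes, the placeholder ID rs0000001, and the function names are invented for illustration and do not come from the array data.

```python
from collections import defaultdict

def find_duplicate_ids(rows):
    """Group row indices by ID and return only the IDs seen more than once."""
    seen = defaultdict(list)
    for index, (snp_id, _genotypes) in enumerate(rows):
        seen[snp_id].append(index)
    return {snp_id: idxs for snp_id, idxs in seen.items() if len(idxs) > 1}

def concordance(calls_a, calls_b):
    """Fraction of samples with identical calls, skipping missing (None) calls."""
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)

# Toy data, invented for illustration: (rs ID, per-sample genotypes coded 0/1/2).
rows = [
    ("rs10760419", [0, 1, 2, 1]),
    ("rs10760419", [0, 1, 2, None]),
    ("rs10762231", [2, 1, 0, 0]),
    ("rs10762231", [0, 1, 2, 2]),
    ("rs0000001", [1, 1, 0, 2]),  # hypothetical unique ID
]

duplicates = find_duplicate_ids(rows)

# Compare the first two rows recorded for each duplicated ID.
report = {
    snp_id: concordance(rows[idxs[0]][1], rows[idxs[1]][1])
    for snp_id, idxs in duplicates.items()
}

for snp_id, value in report.items():
    print(snp_id, value)
```

High concordance would suggest the two probe sets are assaying the same site redundantly. Low concordance would suggest they differ in some way (for example, in allele coding) and should be checked before either row is dropped.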

SNP Affymetrix • 4.0k views

First stop would be to check your annotation. Have you looked a few of the SNPs up on NetAffx (the Affymetrix support site, go to affymetrix.com) to see what their current official annotations are? I suggest you pull down the latest annotation file if you don't already have it, to make sure that you don't have a mangled copy.


I downloaded the annotation through Genotyping Console. When I started using the dataset, it asked me my Affy username (=academic email) and Affy password, and downloaded the latest files. I downloaded the annotation file manually via browser to confirm the issue, and the duplicated Affy SNP ID and dbSNP RS ID values exist there too.

David Reich's group helped design the array, so I looked up their published technical note and found this:

For a very small fraction of sites, we found that the derived allele is different depending on which human is used in SNP discovery (these are potentially triallelic SNPs in the population, although they are not triallelic in the discovery individual). We keep such sites in our list of SNPs for designing, and use multiple probe sets to assay such SNPs.

But when I go through the annotation file, these duplicated probe sets are all biallelic. Any clue what is going on?

AX-50582098     Affx-33256817     rs10760419     TRUE     2     9      T     C     T     C
AX-50582099     Affx-33256817     rs10760419              2     9      T     C     T     C
AX-50023441     Affx-3561055      rs10762231     TRUE     2     10     A     G     G     A
AX-50023442     Affx-3561055      rs10762231              2     10     A     G     G     A
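For the filtering side of the question, here is a minimal Python sketch that keeps one probe set per Affy SNP ID, using the first four columns of the rows above. The thread does not say what the TRUE/blank column means, so treating it as a "preferred probe set" marker is purely an assumption, and the function name is hypothetical. Check the annotation file's column header before relying on this.

```python
# Rows copied from the excerpt above:
# (probe set ID, Affy SNP ID, dbSNP RS ID, flag column).
# ASSUMPTION: the TRUE/blank column marks a preferred probe set. The thread
# does not state what this column means.
annotation = [
    ("AX-50582098", "Affx-33256817", "rs10760419", "TRUE"),
    ("AX-50582099", "Affx-33256817", "rs10760419", ""),
    ("AX-50023441", "Affx-3561055", "rs10762231", "TRUE"),
    ("AX-50023442", "Affx-3561055", "rs10762231", ""),
]

def pick_one_probe_set_per_snp(rows):
    """Return {Affy SNP ID: probe set ID}, preferring a flagged probe set.

    If no probe set for a SNP is flagged, the first one encountered is kept.
    """
    chosen = {}
    for probe_set_id, affy_snp_id, _rs_id, flag in rows:
        is_flagged = flag.strip().upper() == "TRUE"
        current = chosen.get(affy_snp_id)
        # Take the first row seen, then upgrade only to a flagged row.
        if current is None or (is_flagged and not current[1]):
            chosen[affy_snp_id] = (probe_set_id, is_flagged)
    return {snp: probe for snp, (probe, _flagged) in chosen.items()}

keep = pick_one_probe_set_per_snp(annotation)
for affy_snp_id, probe_set_id in keep.items():
    print(affy_snp_id, probe_set_id)
```

The resulting probe set IDs could then be used as a whitelist when exporting genotypes, so that each locus appears only once.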