Is it possible for two different Affymetrix probe set IDs to have common annotations to a single gene? I am looking for the concept behind Affy probe set IDs. Any literature or links?
Different probesets can certainly map to the same gene on the standard Affymetrix GeneChip platform. Groups of probes are combined into probesets, and multiple probesets MAY exist for a gene.
NetAffx is the Affymetrix clearing house of Affymetrix probe ID info: [http://www.affymetrix.com/analysis/index.affx]
You might be interested in the BrainArray Custom CDFs, which reannotate and regroup Affymetrix probes and probesets and are kept more up to date [http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/genomic_curated_CDF.asp]
They also have tools for mapping probesets between chips/species [http://brainarray.mbni.med.umich.edu/Brainarray/Database/ProbeMatchDB/ncbi_probmatch_para_step1.asp]
Also interesting is a resource I have only just found called ADAPT, which "describes the many-to-many relationships between Affymetrix™ probesets transcripts and genes, by directly mapping every probe against publicly available mRNAs/cDNA sequences from RefSeq and Ensembl."
As previously stated in some of the excellent answers above, it is not just possible but common.
We have our own system for 'validating' the mappings between affy probesets and transcripts.
Recently I have worked most with the Affymetrix Drosophila 2.0 chip-set and we find about 5% of probe-sets to be unreliable. Most fail because they are promiscuous i.e. one probe-set maps to more than one gene/transcript.
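To make the "promiscuous" check concrete, here is a minimal Python sketch. It assumes you already have a list of (probe set, gene) pairs from whichever annotation source you trust. The probe set IDs and gene names are invented for illustration, and this is not the validation system mentioned above, just one simple way to index the many-to-many mapping.

```python
from collections import defaultdict

# Hypothetical (probe set ID, gene) pairs; IDs and gene names are invented.
pairs = [
    ("ps_0001_at", "geneA"),
    ("ps_0002_at", "geneA"),
    ("ps_0003_at", "geneB"),
    ("ps_0003_at", "geneC"),
    ("ps_0004_at", "geneC"),
]

def index_mappings(pairs):
    """Build both directions of the many-to-many mapping."""
    genes_by_probeset = defaultdict(set)
    probesets_by_gene = defaultdict(set)
    for probeset, gene in pairs:
        genes_by_probeset[probeset].add(gene)
        probesets_by_gene[gene].add(probeset)
    return genes_by_probeset, probesets_by_gene

def promiscuous_probesets(genes_by_probeset):
    """Probe sets that map to more than one gene."""
    return sorted(ps for ps, genes in genes_by_probeset.items() if len(genes) > 1)

def multi_probeset_genes(probesets_by_gene):
    """Genes that are covered by more than one probe set."""
    return {g: sorted(ps) for g, ps in probesets_by_gene.items() if len(ps) > 1}

genes_by_probeset, probesets_by_gene = index_mappings(pairs)
print(promiscuous_probesets(genes_by_probeset))
print(multi_probeset_genes(probesets_by_gene))
```

In this toy input, `ps_0003_at` is flagged as promiscuous, while `geneA` and `geneC` each have two probe sets.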
Dear Ian, your method sounds very similar to ENSEMBL's (described at http://www.ensembl.org/info/docs/microarray_probe_set_mapping.html). I've been wondering why ENSEMBL's annotations haven't been mentioned here yet, since they seem to be a very obvious and up-to-date source. Am I missing something?
Thanks for your note, Ian. I have noticed different expression levels in probe sets mapped to the same gene. For example, a particular probe set ID 'x' is upregulated in 'cases', but in 'controls' another probe set ID 'y', mapped to the same gene, is upregulated instead. I am a bit confused about whether I can consider this a differentially expressed hit, or whether results should be reported only when the probe set IDs are consistently regulated. Please let me know your thoughts.
Hi Khader.
You have hit on something really pretty important. There are times, as you have illustrated, where two different probe-sets behave differently even when they map to the same gene. This is obviously a bit of a worry, so when this happens I like to try to work out if there is a sensible explanation; if I cannot find one, the best you can do is either flag them with a warning or exclude them. As a simple rule of thumb I would first check whether either (or both) of them are promiscuous, and secondly check whether one or both fall across a splice junction.
If they do, you may be getting a different (or weighted) change in expression between the probe-sets based on the differential expression (or stability) of the splice variants. This is where you need to apply a bit of biology 'nous' to understand something about the mapped genes themselves. One other possibility, if for example it was profile data, is that the probe-sets have the same expression shape but just a different magnitude. If you think this is possible, you could try unitising all of your expression vectors (i.e. giving each a length of one) and then seeing if they converge.
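As a rough illustration of the unitising idea, here is a minimal Python sketch. The two profiles are made-up numbers for two hypothetical probe-sets on the same gene, and "length" is taken to mean Euclidean length; treat it as one way to do the check, not a prescribed method.

```python
import math

def unitise(vector):
    """Scale a vector to Euclidean length one."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot unitise an all-zero vector")
    return [x / norm for x in vector]

def max_abs_difference(a, b):
    """Largest element-wise gap between two equal-length profiles."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical expression profiles over four conditions (invented values).
probeset_x = [1.0, 2.0, 4.0, 2.0]
probeset_y = [3.0, 6.0, 12.0, 6.0]  # same shape as x, three times the magnitude

unit_x = unitise(probeset_x)
unit_y = unitise(probeset_y)

print(max_abs_difference(probeset_x, probeset_y))  # large gap on the raw scale
print(max_abs_difference(unit_x, unit_y))          # close to zero after unitising
```

If the unitised profiles still differ clearly, the probe-sets disagree in shape and not just in magnitude. Note that this only removes a multiplicative scale difference.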
Yes, many probe sets can be associated with the same gene.
Here is the technical documentation from Affymetrix on probe set design: Transcript Assignment for NetAffx™ Annotations
Yes, multiple probeset IDs can map to a single gene. For the types of annotation used to define a probeset ID, have a look at ADAPT.
Daniel: Thanks a lot for the detailed reply, I appreciate it.