Hi everyone, I know that multiple probe IDs can match a single gene, but is it possible for a single Affymetrix probe ID to match different genes on a microarray?
For instance, the probe 207739_s_at matches several genes (> 10) of the [GAGE][1] family. The probe 217365_at also matches several genes (> 5) of the [PRAME][2] family.
This is definitely the case: a single probeset can contain a majority of probes that map to more than one location in the genome.
So I used SCAMPA, http://web.bioinformatics.ic.ac.uk/scampa/section.html?id=5
To do this, the tool has pre-defined thresholds for each of its levels, but you should be able to hack the source to define them yourself.
Of course, these corrections are sensitive to the genome build you are using.
HTH
It depends to some degree on the array platform but yes, to reiterate what has already been said, probes can match more than one location. This can be due to duplication within the genome: for example, the 5'-end of the X chromosome is very similar to the Y chromosome, so probesets such as 218951_s_at (from the HG-U133A platform) match both.
There are tools to deal with this, but one approach is to download the relevant data from e.g. UCSC or Affymetrix and process it with a custom script to remove probesets with > 1 location.
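As a rough illustration of the custom-script approach, here is a minimal Python sketch. It assumes you have already parsed the downloaded alignment data into (probeset ID, chromosome, start, strand) tuples; that record layout, the function name `unique_location_probesets` and the example records are all made up for illustration and are not the UCSC or Affymetrix file format.

```python
from collections import defaultdict


def unique_location_probesets(alignments):
    """Return the set of probeset IDs that align to exactly one location.

    `alignments` is an iterable of (probeset_id, chrom, start, strand)
    tuples. This layout is an assumption; adapt the unpacking to the
    columns of the file you actually downloaded.
    """
    locations = defaultdict(set)
    for probeset_id, chrom, start, strand in alignments:
        # A set collapses repeated rows that describe the same location.
        locations[probeset_id].add((chrom, start, strand))
    return {pid for pid, locs in locations.items() if len(locs) == 1}


# Made-up records for illustration only (not real coordinates).
example = [
    ("probeset_A", "chr1", 1000, "+"),
    ("probeset_B", "chrX", 5000, "+"),
    ("probeset_B", "chrY", 5200, "+"),  # second location: probeset_B is dropped
    ("probeset_C", "chr2", 300, "-"),
    ("probeset_C", "chr2", 300, "-"),   # duplicate row, still one location
]

kept = unique_location_probesets(example)
print(sorted(kept))
```

Whether "more than one location" should be counted per probe or per probeset, and how overlapping hits are treated, is a choice you would need to make for your own data; the sketch simply counts distinct (chromosome, start, strand) combinations per probeset.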
The recently published tool GATExplorer was very useful to me for probe-to-gene mapping. You can either use their server or download the files for microarray probe mapping.
Yes, and this can be annoying when doing downstream annotation like KEGG pathways, etc. For example, if one probe says MapK is going up and another says MapK is going down, how should I annotate this on a "gene-centered" graph like GO or KEGG?
My solution has been to use the "BrainArray" custom CDFs. These are created, and updated weekly, to reconstruct the affy-probesets so that each probeset matches a SINGLE ID. They have a build for UniGene (every probeset matches to a single UniGene ID), Entrez Gene, Entrez Protein, and dozens of others.
I've found that this makes my downstream annotation much easier when I'm dealing with gene and protein level annotations. The only problem is that you need to have the RAW CEL data to use these CDFs.
Hope that helps, Will
expression array probe sets shouldn't really be mapped to the genome though - as those arrays detect RNA signal, right?
but those RNAs have to come from somewhere ... and if the probe hits exons from different genes then this definitely poses a problem for reliable measurements