Microarray Multimapping/Cross-hybridizing probesets
9.1 years ago
Nathaniel ▴ 120

I am analysing an Affymetrix Mouse Gene ST 2.0 array (http://www.affymetrix.com/catalog/131476/AFFY/Mouse+Gene+ST+Arrays)

Looking at the annotation file I have realised that some probes map to multiple genes. Should we remove them for the downstream analysis?

microarray • 2.6k views
ADD COMMENT
9.1 years ago

Typically these are different oligonucleotide capture sequences for the same gene at different positions (or they are for Illumina BeadChips, so I'm assuming it's a similar principle). You can map the probe IDs to nuIDs and convert each nuID to a nucleotide sequence; if you BLAST those sequences, it should show the target region on the gene. Short answer: keep them in the experiment.

Edit: I read that too quickly. Can you explain what you mean by them mapping to multiple genes? How did you determine that?

ADD COMMENT

Dear Andrew,

I think Nathaniel's post relates to the general issue of continuously evolving annotations, and of annotating microarrays through various annotation packages. There is an example with affycoretools (via select, etc.) below:

https://www.bioconductor.org/packages/release/bioc/vignettes/affycoretools/inst/doc/RefactoredAffycoretools.pdf

It is also discussed in detail in this paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1283542/

One issue is probe set redundancy, which is the case you mention above, but there is also another case:

"Non-specific probes"

A significant increase of cDNA/EST/genome sequence information leads to the possibility a probe thought to be specific for one gene can actually hybridize to transcripts from additional genes or non-coding transcripts. As shown in Table 1, according to the current version of the UniGene database, for most GeneChips 10 - 30% of probe sets contain at least one non-specific probe. Probe alignment to genomic sequences also reveals that 5 - 16% of probe sets contain a probe(s) with more than one genomic sequence hit(s). The difference between the UniGene- and genome-based criteria may largely be due to UniGene clustering or EST sequencing errors.

Best,
Efstathios

ADD REPLY

Very comprehensive, thanks!

ADD REPLY

Exactly: in the microarray annotation file you can see a single probeset matching multiple gene tags.

To avoid any problems, I decided to remove all such probesets from the analysis, but I really don't know what the best approach is.

ADD REPLY

Ah, I take it these are "cross-hybridising probes", i.e. the capture sequences map to more than one gene. In that case I'd definitely remove them, as they'd most likely confound your results.
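For anyone wanting to do this filtering outside R, here is a minimal Python sketch of the idea. It assumes the annotation has a `gene_assignment`-style field where multiple assignments are separated by `///` and the fields within each assignment by `//`, with the gene symbol in the second field. That layout is an assumption, so check it against your own annotation file. The probeset IDs and gene names below are made up for illustration.

```python
def gene_symbols(gene_assignment):
    """Return the distinct gene symbols found in one annotation field.

    Assumed layout (verify against your file):
    "accession // symbol // description /// accession // symbol // ..."
    """
    symbols = set()
    if gene_assignment.strip() in ("", "---"):
        return symbols  # no annotation for this probeset
    for entry in gene_assignment.split("///"):
        fields = [f.strip() for f in entry.split("//")]
        if len(fields) > 1 and fields[1] not in ("", "---"):
            symbols.add(fields[1])
    return symbols


def split_by_specificity(annotation):
    """Split probesets into single-gene, multi-gene and unannotated groups.

    annotation: dict mapping probeset ID -> gene_assignment string.
    """
    unique, multi, unannotated = {}, {}, []
    for probeset_id, field in annotation.items():
        symbols = gene_symbols(field)
        if len(symbols) == 1:
            unique[probeset_id] = next(iter(symbols))
        elif len(symbols) > 1:
            multi[probeset_id] = sorted(symbols)
        else:
            unannotated.append(probeset_id)
    return unique, multi, unannotated


# Hypothetical annotation records, for illustration only.
example = {
    "ps_0001": "NM_0001 // GeneA // desc /// ENSMUST0001 // GeneA // desc",
    "ps_0002": "NM_0002 // GeneB // desc /// NM_0003 // GeneC // desc",
    "ps_0003": "---",
}

unique, multi, unannotated = split_by_specificity(example)
print("keep:", unique)
print("multi-gene (candidates for removal):", multi)
print("unannotated:", unannotated)
```

Note that several transcripts of the same gene (as in `ps_0001`) count as one gene here, so only probesets with more than one distinct symbol are flagged.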

ADD REPLY
9.0 years ago
Nathaniel ▴ 120

""

ADD COMMENT

Unfortunately, for some reason I didn't get any updates about the new comments and answers. Nathaniel, you have probably moved on from this issue, but just to give my opinion: there isn't a clear-cut option here. You can proceed as in the affycoretools vignette above and, in the most "naive" way, keep the first mapping:

https://www.bioconductor.org/packages/release/bioc/vignettes/affycoretools/inst/doc/RefactoredAffycoretools.pdf
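The "keep the first mapping" idea is language-independent; the vignette does it in R, and the following is only a rough Python sketch of the same logic, not the affycoretools implementation. It assumes the annotation lookup returns one row per (probeset, gene symbol) pair, so a multi-mapping probeset appears on several rows. The function name and the example rows are hypothetical.

```python
def keep_first_mapping(rows):
    """Collapse one-to-many (probeset_id, symbol) rows to one symbol each.

    Only the first symbol seen for each probeset is kept; later rows for
    the same probeset are ignored. Row order therefore decides the result.
    """
    first = {}
    for probeset_id, symbol in rows:
        first.setdefault(probeset_id, symbol)
    return first


# Hypothetical one-to-many lookup result, for illustration only.
rows = [
    ("ps_0001", "GeneA"),
    ("ps_0002", "GeneB"),
    ("ps_0002", "GeneC"),  # second mapping of ps_0002, dropped
    ("ps_0003", "GeneD"),
]

print(keep_first_mapping(rows))
```

This keeps every probeset in the analysis, at the cost of an arbitrary choice for the multi-mapping ones, which is why it is the "naive" option.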

Alternatively, you could consider the approach from the paper I mentioned above, and download and proceed with custom CDFs.

Hope that helps,

Efstathios

ADD REPLY
