Repeated Ensembl IDs in microarray DEG analysis
10 weeks ago
Pereira G ▴ 10

Hello,

I'm working on a mouse microarray dataset (GPL8321). I annotated the dataset with the annotateEset function from affycoretools and proceeded with the limma pipeline for differential expression. However, looking at the gene names of the DEGs, I noticed that some genes appear more than once, each with different expression values. Looking further, I also noticed that on this platform many probe IDs map to the same Ensembl ID.

eset <- rma(celdata)
eset <- annotateEset(eset, mouse430a2.db, columns = c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME", "ENSEMBL"))

table(duplicated(fData(eset)$ENSEMBL))
FALSE  TRUE 
13113  9577
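
One caveat when reading the counts above: duplicated() also flags every NA after the first, so the TRUE count may mix unannotated probes with genuinely repeated IDs. A minimal sketch on a made-up ID vector (a hypothetical stand-in for fData(eset)$ENSEMBL) that separates the two:

```r
# Made-up IDs standing in for fData(eset)$ENSEMBL; NA = probe without an Ensembl ID.
ids <- c("ENSMUSG_A", "ENSMUSG_A", NA, "ENSMUSG_B", NA, NA, "ENSMUSG_A")

# What table(duplicated(...)) reports as TRUE (repeated NAs are included).
n_flagged <- sum(duplicated(ids))

# Probes with no Ensembl annotation at all.
n_missing <- sum(is.na(ids))

# Repeats among annotated probes only.
n_repeat <- sum(duplicated(ids[!is.na(ids)]))

print(c(flagged = n_flagged, missing = n_missing, repeated = n_repeat))
```

On the real data, the same three counts could be computed by replacing ids with fData(eset)$ENSEMBL.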

My question is: is it best practice to remove the duplicated Ensembl IDs before the differential expression analysis? Wouldn't this high number of duplicates interfere with the statistical analysis and p-value computation?

Should this be handled by computing the mean value of the probes that map to the same Ensembl ID? And how can I do that on a Large ExpressionSet object (eset)?
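
To make the averaging idea concrete, here is a minimal base R sketch on toy data. The matrix expr and the vector ensembl are hypothetical stand-ins for exprs(eset) and fData(eset)$ENSEMBL, and all values are made up. It illustrates the mechanics only and is not a statement that averaging is the recommended approach. limma's avereps() does a similar per-ID averaging on a matrix; it is not used here so that the sketch runs without extra packages.

```r
# Toy stand-in for exprs(eset): 5 probes x 2 samples (values are made up).
expr <- matrix(c(2, 4, 6, 8, 10,
                 1, 3, 5, 7, 9),
               nrow = 5,
               dimnames = list(paste0("probe", 1:5), c("s1", "s2")))

# Toy stand-in for fData(eset)$ENSEMBL; NA marks an unannotated probe.
ensembl <- c("ENSMUSG_A", "ENSMUSG_A", "ENSMUSG_B", NA, "ENSMUSG_B")

# Drop unannotated probes first, so NAs are not pooled into one "gene".
keep <- !is.na(ensembl)
expr_kept <- expr[keep, , drop = FALSE]
ids <- ensembl[keep]

# Sum rows per Ensembl ID, then divide by the number of probes per ID.
sums <- rowsum(expr_kept, group = ids)
counts <- as.vector(table(ids)[rownames(sums)])
gene_means <- sums / counts

print(gene_means)
```

The result is a plain matrix with one row per Ensembl ID. The feature annotation would have to be rebuilt separately if an ExpressionSet is needed downstream.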

R annotation DEGs microarray limma • 197 views