Does anyone know how? Should I take the average of all probes mapped to that gene?
For other platforms, gene-level summaries generally use the median of probes, not the mean. You might also use a procedure called median polish.
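To make the median polish suggestion a bit more concrete, here is a minimal Python sketch of Tukey's procedure for a single gene, assuming the log-scale intensities are arranged as a probes x samples matrix. The function names `median_polish` and `gene_summary` and the toy values are made up for illustration; for real data an established implementation (for example `medpolish` in R) is the safer choice.

```python
from statistics import median

def median_polish(matrix, max_iter=10, tol=1e-6):
    """Tukey's median polish on a probes x samples matrix (list of rows).

    Returns (overall, probe_effects, sample_effects, residuals).
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out the median of each row (probe effect).
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out the median of each column (sample effect).
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        # Stop once a full sweep no longer changes anything noticeably.
        if max(abs(x) for x in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid

def gene_summary(matrix):
    """Gene-level value per sample: overall level plus sample effect."""
    overall, _, col_eff, _ = median_polish(matrix)
    return [overall + c for c in col_eff]

# Toy data: 3 probes (rows) x 3 samples (columns), log2 scale.
toy = [
    [7.0, 9.0, 11.0],
    [8.0, 10.0, 12.0],
    [9.0, 11.0, 13.0],
]
summary = gene_summary(toy)
```

On perfectly additive data like this toy matrix, the result matches the plain per-sample median of the probes; the two only differ when individual probes behave inconsistently across samples.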
Hi,
Perhaps you can clarify what you mean by probes targeting the same genes? With the Illumina technology you have multiple (~20) beads that have the same probe sequence and these are known as a 'bead-type'. Combining beads for the same bead-type is done automatically for you in GenomeStudio, or in beadarray with the summarize function.
A gene can have multiple bead-types targeting it. However, averaging bead-types that target the same gene isn't really recommended for Illumina data. When a gene has multiple bead-types targeting it, the bead-types can target different isoforms of the same gene. Some isoforms might not be expressed (or might have badly designed probe sequences), so by averaging you would be diluting the signal of those isoforms that are expressed. If you want to have the data on a per-gene basis, I tend to pick the 'best' probe for each gene using a criterion such as the highest IQR. Or, for a differential expression analysis such as with limma, I would fit the model to all probes (probably after some filtering to remove non-informative probes) and then map to genes afterwards.
Hope this helps,
Mark
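The 'best probe per gene' idea from the post above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the probe IDs, gene names, and the `best_probe_per_gene` helper are hypothetical, and the IQR is computed with the standard library's inclusive quantile method, which may differ slightly from the default in other tools.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range of one probe's values across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def best_probe_per_gene(expr, probe_to_gene):
    """For each gene, keep the probe whose IQR across samples is largest.

    expr: dict mapping probe ID -> list of expression values (one per sample)
    probe_to_gene: dict mapping probe ID -> gene symbol
    """
    best = {}  # gene -> (probe ID, IQR)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # skip probes without a gene annotation
        score = iqr(values)
        if gene not in best or score > best[gene][1]:
            best[gene] = (probe, score)
    return {gene: probe for gene, (probe, _) in best.items()}

# Hypothetical probe IDs and log2 intensities across four samples.
expr = {
    "probe_A": [7.0, 7.1, 6.9, 7.0],   # nearly flat, low IQR
    "probe_B": [6.0, 9.0, 7.5, 10.0],  # varies across samples, high IQR
    "probe_C": [8.0, 8.2, 8.1, 8.3],
}
probe_to_gene = {"probe_A": "GENE1", "probe_B": "GENE1", "probe_C": "GENE2"}

selected = best_probe_per_gene(expr, probe_to_gene)
```

Here `probe_B` would be kept for `GENE1` because it varies more across samples than `probe_A`, and `probe_C` is kept for `GENE2` as its only probe.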
I have been trained to show all the probes for that gene with the gene name next to it:
e.g.,
http://stemformatics.org/expressions/result?graphType=box&datasetID=2000&gene=stat1&db_id=56
Hi Mark, thanks a lot for your answer. Based on your explanation, I think 'bead-type' would be a more suitable word than 'probe'.