How Many Genes Differentially Expressed In Microarray Can Be Seen As Normal?
14.6 years ago

Hi, I have a time-course microarray dataset (0h, 24h, 48h, 72h, 96h, 144h after sexual stage induction) for Gibberella zeae, a plant pathogen with about 14,000 protein-coding sequences. After analyzing the arrays with the SAS PROC MIXED procedure, I find about 5,000 genes differentially expressed in total across the time course. Is this normal? I really hope somebody can give me some suggestions. Thanks.

microarray • 4.7k views

When considering this question, you must keep in mind that many of the methods for normalization of microarray data assume that only a small fraction of genes are differentially expressed. So even if a large fraction of genes actually exhibit differential expression, your analysis pipeline might not handle this data well, and you might get unpredictable or nonsensical results.
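To make the point above concrete, here is a minimal sketch using median centering, one simple global normalization that relies on the "most genes do not change" assumption. The log-ratio values are made up for illustration, and real pipelines use more elaborate methods (quantile or loess normalization, for example), but the failure mode is similar.

```python
import statistics

def median_center(log_ratios):
    """Subtract the median log-ratio from every gene (simple global normalization)."""
    m = statistics.median(log_ratios)
    return [x - m for x in log_ratios]

# Hypothetical log2(treated/control) values for 10 genes.
# Case 1: only 2 of 10 genes truly change (up by 2).
few_changed = [0.0] * 8 + [2.0, 2.0]
# Case 2: 7 of 10 genes truly change (up by 2).
many_changed = [0.0] * 3 + [2.0] * 7

print(median_center(few_changed))   # unchanged genes stay at 0, changed genes stay at 2
print(median_center(many_changed))  # unchanged genes become -2, changed genes become 0
```

In the second case the normalization erases the real signal and makes the unchanged genes look down-regulated, which is the kind of nonsensical result to watch for.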

14.6 years ago

Last year a paper suggested that nearly all genes are transcriptionally regulated during plant infection.

I think this might actually be the case for all organisms. When something happens, the whole transcriptome is slightly regulated. Some genes change dramatically; the others simply "adjust" to the new "state".

The fact is that, usually, you can show that only a few genes are regulated, because to pass a statistical test you need either a big shift in mean expression value or many, many replicates. Given the cost of microarrays, the latter is rarely possible, so you end up "seeing" only those genes that have big swings in expression. Furthermore, you need to correct for multiple testing, and to make sure you don't have too many false positives, you end up with many false negatives.
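As a rough illustration of the multiple-testing trade-off, here is a minimal sketch of the Benjamini-Hochberg FDR adjustment in plain Python. The p-values are invented for the example, and this is not necessarily the correction used in the original analysis; it only shows how genes that pass an uncorrected 0.05 threshold can drop out after correction.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five genes
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(pvals)

raw_hits = sum(p <= 0.05 for p in pvals)
fdr_hits = sum(p <= 0.05 for p in adj)
print(raw_hits, fdr_hits)  # 4 genes pass uncorrected, 2 pass after adjustment
```

The two genes lost after adjustment may well be real changes, which is where the false negatives come from when replicates are few.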

The above-mentioned paper had 72 (!) biological replicates because it was the collection of all the "controls" of a massive experiment, so its statistics are very powerful.

If you have many replicates and/or the biological replicates are very homogeneous, you might find that many genes turn out to be regulated.


OK, thanks very much. Your suggestions are really helpful to me. In fact, my microarray experiment has only four replicates, so there could be a lot of noise among them, though I used FDR and a fold change > 2 as cutoffs. All of these procedures were carried out with SAS PROC MIXED. I will try other methods to reanalyze my data; maybe by comparison I can avoid some big mistakes.


Great info. 72 replicates is very impressive, and it produced some eye-opening results.

14.6 years ago
User 59 13k

There's no metric for a 'normal' number of differentially expressed genes in a microarray experiment; this is going to vary massively depending on your experimental conditions. I've seen very well replicated experiments with thousands of differentially expressed genes detectable in a very robust fashion, and other very targeted experiments (siRNA knockdowns) in which only a handful of genes are perturbed.

Given that you're reporting a number of genes differentially expressed 'in total of the time course' maybe you should be looking at changes between the timepoints as well as across the whole experiment?

The real issue is that dissecting a gene list 5000 genes long to get any more meaningful information is a bit more of a challenge than dissecting one 500 genes long.


Actually, I use 0h as the control and compare each of the other time points with it. Then I use a Perl script to find the intersections of these comparisons. For example, with 24h as A, 48h as B, and so on, I get subsets like the following: A, AB, AC, AD, AE, ABC, ABD, ABE, CDE, ..., ABCDE. Does this make sense?
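For readers who want to see what this subset bookkeeping looks like, here is a minimal sketch in Python (not the Perl script mentioned above, which is not shown). The gene IDs and set contents are hypothetical; each gene is assigned to the exact combination of comparisons in which it was called differentially expressed.

```python
from collections import defaultdict

def exclusive_subsets(de_sets):
    """Group genes by the exact combination of comparisons where they are DE."""
    membership = defaultdict(list)
    # Visit labels in sorted order so combined keys read "AB", not "BA"
    for label in sorted(de_sets):
        for gene in de_sets[label]:
            membership[gene].append(label)
    groups = defaultdict(set)
    for gene, labels in membership.items():
        groups["".join(labels)].add(gene)
    return dict(groups)

# Hypothetical DE gene lists: A = 24h vs 0h, B = 48h vs 0h, C = 72h vs 0h
de_sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
result = exclusive_subsets(de_sets)
for key in sorted(result):
    print(key, sorted(result[key]))
```

Here g3 lands only in ABC and g2 only in AB, so every gene appears in exactly one subset.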


I'd be more tempted to use something other than a Perl script for doing Venn diagrams. I'd seriously consider using something that allows you to set up meaningful contrasts (limma in Bioconductor, for instance) to analyse this data. There are plenty of packages specifically for analysing time-course data. maSigPro comes to mind as well: http://bioinformatics.oxfordjournals.org/cgi/content/full/22/9/1096 (that reference should prove interesting regardless of whether or not you use the methodology).
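To show what "contrasts" means here, the following is a language-neutral sketch in plain Python. It is not limma's API and does no statistical testing; the expression values are hypothetical. A contrast is just a vector of coefficients applied to the per-time-point group means, so both "each time point versus 0h" and "each time point versus the previous one" can be written the same way.

```python
time_points = ["0h", "24h", "48h", "72h", "96h", "144h"]

def contrast(plus, minus):
    """Coefficient vector for mean(plus) - mean(minus) over time_points."""
    return [1 if t == plus else -1 if t == minus else 0 for t in time_points]

# Each later time point versus the 0h control
vs_control = {f"{t}-0h": contrast(t, "0h") for t in time_points[1:]}
# Each time point versus the one before it
consecutive = {f"{b}-{a}": contrast(b, a)
               for a, b in zip(time_points, time_points[1:])}

def apply_contrast(coeffs, group_means):
    """Estimated log2 difference for one gene under one contrast."""
    return sum(c * m for c, m in zip(coeffs, group_means))

# Hypothetical mean log2 expression of one gene at each time point
means = [8.0, 8.5, 10.0, 10.0, 9.0, 8.0]
print(apply_contrast(vs_control["48h-0h"], means))    # 2.0
print(apply_contrast(consecutive["48h-24h"], means))  # 1.5
```

A package such as limma adds the linear model fit and moderated statistics on top of this idea, so the contrasts are tested within one model instead of as separate pairwise lists.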


Thanks. I will follow your suggestion and reanalyze my microarray data.
