Question

How To Calculate Fold Change And Its Significance In Quantitative Proteomics

13.5 years ago

Woa ★ 2.9k

I would like to know which (statistical) methods are commonly used for measuring differential protein expression in a quantitative proteomics study, particularly when the 'treatment' and 'control' groups include replicates. In the diagram, the numerical values indicate protein expression levels in some arbitrary unit.

Link to Diagram

proteomics statistics • 12k views

updated 13.3 years ago by Julian ▴ 200 • written 13.5 years ago by Woa ★ 2.9k

score 4 · Answer 1 · 2011-06-01

In addition to what Laurent has answered, I would say that "common" statistical methods are only just beginning to be applied in proteomics. One of the key areas I look to for statistical methodology is the microarray field, although you have to take a bit of care with its assumptions when dealing with quantitative proteomics data.

One of your major choices will be how to extract the data into a usable form - one of the standard formats may be a good way to go (e.g., mzXML, mzML, mzIdentML). However, be aware that a lot of mass spectrometers automatically process the data before outputting anything. As a result, you have to ask what counts as raw data, and what the values you get out actually mean. The only hope is that all proteins in your samples will be treated in the same manner (provided you have used the same mass spectrometer to gather all the data from your samples).

Then, as Laurent says, you'll need to consider normalising across all your data sets to account for differences in the amount of protein loaded on a gel, or on an LC column, or whatever you have done with your sample. You might have allowed for this in your experimental design by including standards, etc.
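As a rough illustration of the normalisation step, here is a minimal Python sketch of one simple approach: log2-transform each run and subtract that run's median, so that a run loaded with more protein overall is pulled back onto the same centre. The function name, run names and intensities are all made up for the example, and median centring is only one of several possible choices, not necessarily the right one for your data.

```python
import math
from statistics import median

def median_normalise(samples):
    """Log2-transform each run and subtract its own median (hypothetical helper)."""
    normalised = {}
    for name, intensities in samples.items():
        logged = [math.log2(v) for v in intensities]
        centre = median(logged)
        normalised[name] = [v - centre for v in logged]
    return normalised

# Made-up intensities for the same four proteins in two runs;
# run_b stands for a run where twice as much protein was loaded.
samples = {
    "run_a": [100.0, 200.0, 400.0, 800.0],
    "run_b": [200.0, 400.0, 800.0, 1600.0],
}
normalised = median_normalise(samples)
```

After centring, the two runs give the same values for each protein, because the only difference between them was the overall loading. Note that this assumes most proteins do not change between samples; if that is not true of your experiment, spiked-in standards are a safer reference.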

After all that, you might also want to do some principal component analysis, to ensure that the differences between your samples are principally down to your "observations" and not to other factors, such as when the samples were processed.
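To make the PCA check concrete, here is a small stdlib-only Python sketch that computes each sample's score on the first principal component using power iteration. The function name and the log2 intensities are invented for illustration; in practice you would more likely use an existing PCA routine in R or a numerical library, and look at more than one component.

```python
import math

def first_pc_scores(rows, iterations=200):
    """Score of each sample (row) on the first principal component (hypothetical helper)."""
    n, p = len(rows), len(rows[0])
    # Centre each protein (column) on its mean across samples.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample-by-sample Gram matrix; its leading eigenvector gives the PC1 scores up to scale.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centred] for ri in centred]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(v * sum(g * w for g, w in zip(row, vec)) for v, row in zip(vec, gram))
    return [v * math.sqrt(eigenvalue) for v in vec]

# Made-up log2 intensities: rows are samples, columns are four proteins.
data = [
    [10.0, 12.1, 8.0, 9.0],   # control 1
    [10.2, 11.9, 8.1, 9.1],   # control 2
    [9.9, 12.0, 7.9, 8.9],    # control 3
    [12.0, 12.0, 8.0, 7.0],   # treatment 1
    [12.1, 12.2, 8.1, 7.1],   # treatment 2
    [11.9, 11.9, 7.9, 6.9],   # treatment 3
]
scores = first_pc_scores(data)
```

If the experiment behaved, the control and treatment samples end up on opposite sides of zero on the first component. If the samples instead split by processing date or batch, that is the warning sign described above. The sign of the scores is arbitrary, so only the grouping matters.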

From that, you might want to set up some ANOVAs (or, if appropriate, just a simple t-test), but make sure the assumptions behind the test actually hold for your data and experiments.
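For a single protein with replicates in two groups, the fold change and a simple t-test could look something like the Python sketch below. It works on log2 intensities and uses Welch's unequal-variance t-test, which is my choice for the example rather than something the question prescribes; the function names and intensities are made up.

```python
import math
from statistics import mean, variance

def log2_fold_change(treatment, control):
    """Difference of mean log2 intensities, treatment minus control."""
    return mean(math.log2(v) for v in treatment) - mean(math.log2(v) for v in control)

def welch_t(treatment, control):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom on log2 values."""
    a = [math.log2(v) for v in treatment]
    b = [math.log2(v) for v in control]
    # Squared standard error of each group mean.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up intensities for one protein, three replicates per group.
control = [100.0, 110.0, 90.0]
treatment = [210.0, 190.0, 200.0]
lfc = log2_fold_change(treatment, control)
t, df = welch_t(treatment, control)
```

Here the log2 fold change comes out close to 1, i.e. roughly a doubling. The p-value is then read from a Student's t distribution with `df` degrees of freedom; the sketch stops short of that because the Python standard library has no t distribution, but any statistics package will do it.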

In terms of software, here is what I use. If you have money, you might want to consider Nonlinear Dynamics' Progenesis - it's not perfect, but it allows you to look at the data with some statistical rigour. It is quite restrictive in the route it gives you through the data, but it seems to work well, especially when using the raw data. If you don't have money, look at some of the R stuff (particularly within the Bioconductor project) - it's always progressing in terms of development, and if you can get the data into the right format it works well, although it can be a handful if you aren't used to command-line processing. Also worth a look is software like MaxQuant from the Mann lab - but getting data into it can be a trial. When I last played with it, it didn't easily support Mascot identifications. If you are doing things like SILAC, then there is also software like SILACAnalyzer, which is part of the OpenMS suite of tools for proteomics data analysis.

score 2 · Answer 2 · 2011-05-31

This is a bit of an open question. You don't need any specific method tailored for quantitative proteomics data. Any appropriate statistical method should perform well as long as its basic requirements are met. You might also think about data normalisation and possibly quality control.

You might get more precise answers from more people if you provide more details about your data, how it was generated and how you performed quantitation.

Hope this helps.