Proteomics Data Analysis
7 weeks ago

I am analyzing imputed mass spectrometry proteomics data and have tried several normalization options with NormalyzerDE: VSN, quantile (with an additional log2 transformation), and log2 transformation alone. In almost all cases the results show a very high number of differentially expressed proteins and odd-looking volcano plot distributions. The only method that looked reasonable was quantile + log2 transformation, which gave a more typical number of differentially expressed proteins (by huge differences I mean 800-1000 differentially expressed proteins with the other methods versus around 100-200 with quantile + log2 transformation). I would like to know whether such a discrepancy is possible in a correctly performed analysis. I used to analyse proteomics data with only a log2 transformation, without any extra normalization, and it used to work well. Apart from that, I have observed that when I analyse these data with limma, the results are much worse than with the Wilcoxon test (I think this happens because most proteins do not follow a normal distribution, so a non-parametric test should be more accurate).
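For readers unfamiliar with the "quantile + log2" step mentioned above, here is a minimal sketch of what it does. It is not the NormalyzerDE implementation. It assumes a complete proteins x samples matrix (no missing values), applies log2 first and quantile normalization second (the post does not state the order), and breaks ties by row order. The function names `log2_transform` and `quantile_normalize` are hypothetical.

```python
import math


def log2_transform(matrix):
    """Log2-transform a proteins x samples matrix of positive intensities."""
    return [[math.log2(v) for v in row] for row in matrix]


def quantile_normalize(matrix):
    """Force every sample (column) to share the same value distribution.

    Assumes no missing values; ties are broken by row order.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[i][j] for i in range(n_rows)] for j in range(n_cols)]
    # order[j][r] = row index of the r-th smallest value in sample j
    order = [sorted(range(n_rows), key=lambda i, c=col: c[i]) for col in columns]
    # Reference distribution: mean across samples at each rank
    reference = [
        sum(columns[j][order[j][r]] for j in range(n_cols)) / n_cols
        for r in range(n_rows)
    ]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for rank, i in enumerate(order[j]):
            result[i][j] = reference[rank]
    return result


# Made-up intensities: 3 proteins (rows) x 3 samples (columns)
raw = [
    [100.0, 220.0, 90.0],
    [1600.0, 3500.0, 1500.0],
    [400.0, 800.0, 350.0],
]
normalized = quantile_normalize(log2_transform(raw))
for row in normalized:
    print([round(v, 3) for v in row])
```

After this step every sample has exactly the same set of values, so any global shift between samples is removed. This is a much stronger assumption than log2 alone, and it can plausibly change the number of proteins called differentially expressed.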

Proteomics • 506 views

Please 1) don't open multiple questions on the same problem; this one is just a recap of "Proteomics DEA" and dilutes information, and 2) use ADD COMMENT to reply to answers, which keeps the thread organized.

7 weeks ago
Gordon Smyth ★ 7.8k

There's no reason at all why you shouldn't get 1000 DE genes for a proteomics dataset. Not sure why you would see that as a problem in itself. It is not correct to judge a method as better or worse simply by how many DE genes it produces, unless of course you know the true results in advance, as for simulations or calibration datasets.

You say that the data is not normal, but limma does not assume a normal histogram of data. I wonder how you have assessed normality. Do you mean that the data shows outliers?


But if you have 3000 exosome proteins detected, having around 900 DE proteins is a little bit weird, no?

With respect to normality, I performed a Shapiro-Wilk test for every protein before deciding which test to use. The data do not show outliers, but they give very different results depending on whether I perform:

1) Quantile normalisation + log2 transformation; 2) VSN / only log2 transformation


Shapiro's test is not applicable or necessary in this context.

I don't see anything strange about 900 DE proteins if the treatment conditions are quite different.

Personally, I'd be much more worried about checking the imputation method, as some imputation methods do cause liberality in the DE analysis. I use the limpa package ( https://github.com/SmythLab/limpa ), which works well with DIA-NN or Spectronaut.
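As a rough illustration of how imputation can make a DE analysis liberal, the sketch below compares a Welch t-statistic computed on the observed values only with the same statistic after filling missing values with the group mean. Group-mean imputation is chosen only because it makes the effect easy to see. It is an assumption for illustration, not the method used by the poster and not how limpa works. The data values and the helper names `welch_t` and `impute_group_mean` are made up.

```python
import math
import statistics


def welch_t(a, b):
    """Welch t-statistic for two lists of log2 intensities."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)


def impute_group_mean(values):
    """Replace None with the mean of the observed values in the same group."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


# Made-up log2 intensities for one protein; None marks a missing value
control = [20.1, 20.9, None, 20.5, None]
treated = [21.4, None, 22.0, None, 21.5]

observed_control = [v for v in control if v is not None]
observed_treated = [v for v in treated if v is not None]

t_observed = welch_t(observed_control, observed_treated)
t_imputed = welch_t(impute_group_mean(control), impute_group_mean(treated))

print("t using observed values only:", round(t_observed, 2))
print("t after group-mean imputation:", round(t_imputed, 2))
```

The imputed values add no new information, yet they leave the group means unchanged, shrink the variance estimates, and inflate the apparent sample size. The statistic therefore grows in absolute value, and downstream tests treat the protein as more significant than the data support.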
