Dealing with outliers for differential expression in proteomics
7 weeks ago
Jorge HB • 0

I have 15 samples, 3 replicates per condition, with intensity values for ~ 9500 proteins. Samples have some missingness (worst case 15%) and imputation has been performed prior to the differential expression analysis.

Looking at my imputed data in a PCA plot, I think there are some outliers that may bias my DEA results:

After looking at my DEA results, I find that the contrasts involving the red group may be somewhat inflated. For example, comparing the red group with the green group, I get ~1,500 significant proteins. The people in charge of the project would prefer not to remove any replicates because of the small sample size for each condition.

Would it be valid to run the DE analysis with all samples, run it again without those two outliers, and keep as significant only the proteins that are significant in both analyses?

Thanks in advance for any suggestion.

proteomics PCA DEA outliers limma • 291 views

There are a couple of options other than removing them:

1) Use sample weights, for example arrayWeights() in limma (see its manual) to downweight outliers in a data-driven fashion.

2) Include the replicate information in the design, basically treating each replicate as a batch.

3) Use something like the sva package to estimate surrogate variables that capture unwanted variation, and then include these in the design.

There seems to be a clear condition difference, so I would start with 1) since it is easy and quick to do, and then see what comes out.
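
For option 1), a minimal sketch of what this could look like is below. The data are simulated as a stand-in for the real log2 intensity matrix, and the condition labels (A to E), the object names and the A vs B contrast are all placeholders rather than anything from the original analysis.

```r
library(limma)

# Simulated stand-in for the real data (an assumption for illustration):
# log2 intensities for 200 proteins x 15 samples (5 conditions x 3 replicates).
set.seed(1)
condition <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
expr <- matrix(rnorm(200 * 15, mean = 20, sd = 0.5), nrow = 200,
               dimnames = list(paste0("prot", 1:200), paste0("s", 1:15)))
# Make sample 1 noisier than the rest so that it behaves like an outlier.
expr[, 1] <- expr[, 1] + rnorm(200, sd = 2)

# One coefficient per condition, no intercept.
design <- model.matrix(~ 0 + condition)
colnames(design) <- levels(condition)

# Estimate one relative quality weight per sample from the data.
aw <- arrayWeights(expr, design)

# Fit the linear model using the sample weights, then test one contrast.
fit <- lmFit(expr, design, weights = aw)
contr <- makeContrasts(AvsB = A - B, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contr))
res <- topTable(fit2, coef = "AvsB", number = Inf)
```

Inspecting `aw` before looking at the results is a useful sanity check: the samples that stand apart in the PCA should be the ones receiving the lowest weights.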


To determine whether imputation might be inducing this effect: what does a PCA on the proteins quantified in all samples look like before imputation? You can easily produce one with limma::plotMDS().
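
A rough sketch of that check, assuming the pre-imputation data sit in a log2 intensity matrix with NA for missing values. The matrix `raw`, the condition labels and the simulated values are placeholders, not the actual data.

```r
library(limma)

# Simulated stand-in for the pre-imputation matrix (an assumption):
# 300 proteins x 15 samples, with about 10% of values set to missing.
set.seed(1)
condition <- factor(rep(c("A", "B", "C", "D", "E"), each = 3))
raw <- matrix(rnorm(300 * 15, mean = 20, sd = 1), nrow = 300,
              dimnames = list(paste0("prot", 1:300), paste0("s", 1:15)))
raw[sample(length(raw), 450)] <- NA

# Keep only the proteins quantified in every sample, so no imputed value
# can influence the plot.
complete <- raw[complete.cases(raw), , drop = FALSE]

# MDS plot of the complete-case proteins, coloured by condition.
plotMDS(complete, labels = colnames(complete), col = as.integer(condition))
```

If the same samples still separate from their group here, imputation is unlikely to be what is driving the outliers.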
