Multiple protein IDs issue: which protein IDs should be selected for downstream analysis?
4.8 years ago
harelarik ▴ 90

In the proteinGroups file (MaxQuant output) there is a column named "Majority protein IDs". According to Tyanova et al. (Nature Protocols, 2016), this column lists the proteins that have at least half of the peptides that the leading protein of the group has. As a result, an entry in this column often contains multiple protein IDs.

When a table cell contains multiple protein IDs, which protein ID should be selected for downstream analysis, mainly for assigning GO ids and calculating GO enrichment?

1. Is it better to take only the first protein ID in each cell, which should be the best one, since the IDs are sorted by the total number of identified peptides?
2. Or is it better to take all protein IDs in each cell? This accounts for paralogs that are translated simultaneously, at the cost of over-representing those that were not translated.

Thank you,

Arik

Proteomics enrichment persus • 1.0k views
4.8 years ago
harelarik ▴ 90

I was advised by an expert in the field that all protein IDs in the "Majority protein IDs" column should be used. If only one protein ID is selected and it happens to be a poorly annotated protein, many of the annotations will be missed.

If we want to assign annotations (e.g., GO ids) to an entry in the proteinGroups file (i.e., one row in the table) that has multiple protein IDs, we should take the GO ids associated with all of the protein IDs in its "Majority protein IDs" cell. Then each GO id is counted only once per entry.
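For anyone who wants to script this, here is a minimal sketch in Python of the "take all IDs, count each GO id once" approach. It assumes the IDs in a "Majority protein IDs" cell are separated by semicolons, as in MaxQuant output. The `protein_to_go` dictionary, the accessions, and the GO ids are made-up placeholders; in practice the mapping would come from whatever annotation source you use.

```python
def split_ids(cell):
    """Split one 'Majority protein IDs' cell into individual protein IDs."""
    return [p.strip() for p in cell.split(";") if p.strip()]


def go_terms_for_group(cell, protein_to_go):
    """Return the union of GO ids over every protein ID in one protein group.

    Using a set means a GO id shared by several proteins in the group
    is counted only once for that row.
    """
    terms = set()
    for protein_id in split_ids(cell):
        # Proteins missing from the annotation table contribute nothing.
        terms.update(protein_to_go.get(protein_id, ()))
    return terms


# Hypothetical annotation table: placeholder accessions and GO ids.
protein_to_go = {
    "P11111": {"GO:0000001", "GO:0000002"},
    "P22222": {"GO:0000002", "GO:0000003"},
    "P33333": set(),  # a poorly annotated protein
}

# Two example rows of the "Majority protein IDs" column.
majority_ids = ["P11111;P22222", "P33333;P11111"]

for cell in majority_ids:
    print(cell, sorted(go_terms_for_group(cell, protein_to_go)))
```

Note the second row: taking only the first ID (`P33333`) would give no annotations at all, which is exactly the concern with selecting a single protein ID.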
