4 months ago
SHN
Hello All,
I have a quick question regarding a proteomics dataset generated by MaxQuant/FragPipe (DIA-NN) software. I have two rows for one protein, and each row corresponds to a peptide.
protID    sequence     Treatment1  Treatment1  Treatment2  Treatment2
protein1  AADDQILFDT   156535      156345      314642      112323
protein1  AADDQILFDTR  128766      165265      312767      102654
An example of the file is shown above. Because the two peptide sequences for protein1 are almost the same, I want to count them as one and combine their intensities for each treatment. What is the rule for this calculation? Do I simply sum them, or is there a function for proteomics data similar to collapsing samples in DESeq for RNA-seq?
Any insights will be appreciated.
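For illustration, the "simply sum them" option from the question can be sketched with pandas as below. This is only a minimal sketch, not an established rule: it assumes the intensities are raw (not log-transformed), and the replicate suffixes (_1, _2) and the name protID are added here just to get unique, readable column names.

```python
import pandas as pd

# Toy table copied from the post. The _1/_2 replicate suffixes are an
# assumption, added only so that the column names are unique.
peptides = pd.DataFrame({
    "protID": ["protein1", "protein1"],
    "sequence": ["AADDQILFDT", "AADDQILFDTR"],
    "Treatment1_1": [156535, 128766],
    "Treatment1_2": [156345, 165265],
    "Treatment2_1": [314642, 312767],
    "Treatment2_2": [112323, 102654],
})

# Every intensity column starts with "Treatment" in this toy example.
sample_cols = [c for c in peptides.columns if c.startswith("Treatment")]

# Sum the raw intensities of all peptide rows that share a protein ID,
# separately for each sample column.
summed = peptides.groupby("protID")[sample_cols].sum()
print(summed)
```

Summing makes sense only on the raw intensity scale; if the values were already log2-transformed, they would need to be converted back before adding them up.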
There is a chance that this is an XY problem and you just want to perform a differential expression/abundance analysis. If that is the case, I would take a look at proteinGroups.txt, which should be part of the MaxQuant output. There, peptides are already summarized to proteins. I am using nf-core/differentialabundance for proteomics, and the pipeline expects this file as input. I know for a fact that there are Bioconductor packages that can use proteinGroups.txt, but I don't know them by heart.

Thanks. I am using the DIA output and would like to run the DE analysis at the peptide level, to see whether the abundance of a given modified peptide of a protein differs between treatments. However, as you can see, some of my sequences differ by a single amino acid while carrying the same modified position. So I am not sure whether I should sum the peptide values or run the analysis at the peptide level.
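One way to explore the peptide-level option is to compute a log2 fold change per peptide and check whether the two near-identical sequences move in the same direction before deciding to combine them. The snippet below is a rough sketch under stated assumptions: the replicate suffixes (_1, _2) and the groups mapping are hypothetical names introduced here, and a difference of mean log2 intensities is used as a simple effect size, not as a replacement for a proper statistical test.

```python
import numpy as np
import pandas as pd

# Toy table copied from the post, indexed by peptide sequence.
# The _1/_2 replicate suffixes are an assumption for unique column names.
peptides = pd.DataFrame({
    "protID": ["protein1", "protein1"],
    "sequence": ["AADDQILFDT", "AADDQILFDTR"],
    "Treatment1_1": [156535, 128766],
    "Treatment1_2": [156345, 165265],
    "Treatment2_1": [314642, 312767],
    "Treatment2_2": [112323, 102654],
}).set_index("sequence")

# Hypothetical mapping from treatment group to its replicate columns.
groups = {
    "Treatment1": ["Treatment1_1", "Treatment1_2"],
    "Treatment2": ["Treatment2_1", "Treatment2_2"],
}

# Log2-transform only the numeric intensity columns.
all_samples = groups["Treatment1"] + groups["Treatment2"]
log_int = np.log2(peptides[all_samples])

# Per-peptide log2 fold change: mean log2 intensity in Treatment2
# minus mean log2 intensity in Treatment1.
log2fc = (log_int[groups["Treatment2"]].mean(axis=1)
          - log_int[groups["Treatment1"]].mean(axis=1))
print(log2fc)
```

If the two sequences show similar fold changes, treating them as one feature is easier to justify; if they disagree, keeping them as separate rows in the peptide-level analysis preserves that information.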