Question

Some questions about CPTAC data processing

4.3 years ago

tujuchuanli ▴ 130

Hi,

I am planning to identify amino acid mutations at the protein level by analyzing MS data from CPTAC. I have some questions about data processing:

1. There are multiple mzML files for each sample. How can I merge them into a single file?
2. For breast cancer, three samples were pooled into a single MS run. Can I separate these three samples using the mzML files?
3. To identify amino acid mutations at the protein level, I used "sapfinder" from Bioconductor, following a previous study (http://www.ncbi.nlm.nih.gov/pubmed/29706454). However, I always get this error: "non-standard CODEC used for mzML peak data (CODEC type=zlib compression). File cannot be interpreted. decoded size 2289 and required size % dont match:". How can I fix it?

Thanks

programming genomics protein • 1.2k views

For question 3: X!Tandem cannot read mzML files that use zlib compression. sapfinder works once the mzML files have been re-converted with MSConvert with the "Use zlib compression" option unchecked (http://proteowizard.sourceforge.net/download.html).
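To check which files need re-converting, one option is to look for the zlib marker inside each mzML file. The sketch below is only an illustration, not part of sapfinder or MSConvert: it assumes the files follow the standard PSI-MS controlled vocabulary, where accession MS:1000574 marks zlib-compressed binary arrays, and the function name `uses_zlib` is made up for this example.

```python
import xml.etree.ElementTree as ET

# PSI-MS controlled-vocabulary accession for "zlib compression".
# (MS:1000576 is the corresponding "no compression" term.)
ZLIB_ACCESSION = "MS:1000574"


def uses_zlib(mzml_source):
    """Return True if any cvParam in the mzML declares zlib compression.

    mzml_source may be a file path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    # iterparse reads the file incrementally; clear() drops each
    # element's contents once it has been inspected.
    for _event, elem in ET.iterparse(mzml_source, events=("end",)):
        # Tags are usually namespaced, e.g. "{http://psi.hupo.org/ms/mzml}cvParam".
        if elem.tag == "cvParam" or elem.tag.endswith("}cvParam"):
            if elem.get("accession") == ZLIB_ACCESSION:
                return True
        elem.clear()
    return False
```

Calling `uses_zlib("sample.mzML")` (the file name is a placeholder) on each input should flag the files that X!Tandem would reject.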

4.3 years ago by tujuchuanli ▴ 130

For question 1: it seems that FileMerger from OpenMS can merge multiple mzML files (https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Documentation/release/latest/html/TOPP_FileMerger.html).
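A rough sketch of how the merge could be scripted is given below. It assumes that OpenMS is installed with `FileMerger` on the PATH and that the tool takes its inputs via `-in` and its output via `-out`, as the linked documentation describes; I have not tested it on CPTAC data. The helper names and file names are placeholders.

```python
import subprocess


def build_filemerger_command(input_paths, output_path, executable="FileMerger"):
    """Build the argument list for an OpenMS FileMerger call.

    Assumes the -in / -out options described in the TOPP documentation.
    (Hypothetical helper for illustration only.)
    """
    if len(input_paths) < 2:
        raise ValueError("need at least two mzML files to merge")
    # -in takes a space-separated list of input files.
    return [executable, "-in", *input_paths, "-out", output_path]


def merge_mzml(input_paths, output_path):
    """Run FileMerger; raises CalledProcessError if the tool fails."""
    cmd = build_filemerger_command(input_paths, output_path)
    subprocess.run(cmd, check=True)
    return cmd
```

For example, `merge_mzml(["fraction_01.mzML", "fraction_02.mzML"], "sample_merged.mzML")` would run the equivalent of `FileMerger -in fraction_01.mzML fraction_02.mzML -out sample_merged.mzML`.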

4.3 years ago by tujuchuanli ▴ 130