Some questions about the CPTAC data processing
3.7 years ago
tujuchuanli ▴ 130

Hi,

I am planning to identify amino acid mutations at the protein level by analyzing MS data from CPTAC. I have some questions about data processing:

  1. There are multiple mzML files for each sample. How can I merge them into a single file?

  2. For breast cancer, three samples were combined into one MS run. Can I separate these three samples based on the mzML files?

  3. To identify amino acid mutations at the protein level, I used "sapfinder" from Bioconductor, following a previous study (http://www.ncbi.nlm.nih.gov/pubmed/29706454). However, I always get this error: "non-standard CODEC used for mzML peak data (CODEC type=zlib compression). File cannot be interpreted. decoded size 2289 and required size % dont match:". How can I fix it?

Thanks

programming genomics protein • 1.1k views

For question 3: X!Tandem can't read mzML files that use zlib compression. sapfinder works if the mzML files are converted with MSConvert with the "Use zlib compression" option unchecked (http://proteowizard.sourceforge.net/download.html).
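To check which files need reconverting before running sapfinder, something like the sketch below might help. It assumes the mzML files declare their binary encoding with the PSI-MS controlled vocabulary terms MS:1000574 ("zlib compression") and MS:1000576 ("no compression"). The function names detect_compression and mzml_compression are made up for this example and are not part of any library.

```python
# Rough check for zlib-compressed binary data in an mzML file.
# Assumption: the file marks its encoding with PSI-MS cvParam accessions.
ZLIB_ACCESSION = "MS:1000574"            # "zlib compression"
NO_COMPRESSION_ACCESSION = "MS:1000576"  # "no compression"


def detect_compression(lines):
    """Return 'zlib', 'none', or 'unknown' for an iterable of mzML text lines.

    Stops at the first compression term found, so it only reports how the
    first annotated binary array is encoded.
    """
    for line in lines:
        if ZLIB_ACCESSION in line:
            return "zlib"
        if NO_COMPRESSION_ACCESSION in line:
            return "none"
    return "unknown"


def mzml_compression(path):
    """Open an mzML file as text and report its compression setting."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return detect_compression(handle)
```

Calling mzml_compression("sample.mzML") (a placeholder file name) on each file would show which ones report "zlib" and so need to be converted again without compression.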


For question 1: FileMerger from OpenMS appears to be able to merge multiple mzML files (https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Documentation/release/latest/html/TOPP_FileMerger.html).
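A minimal sketch of calling FileMerger from a script, assuming OpenMS is installed and the FileMerger executable is on the PATH. It uses only the -in and -out options; check the linked documentation for the options available in your OpenMS version. The helper names build_filemerger_command and merge_mzml are made up for this example.

```python
import shutil
import subprocess


def build_filemerger_command(input_files, output_file):
    """Build the argument list for an OpenMS FileMerger call."""
    inputs = list(input_files)
    if not inputs:
        raise ValueError("at least one input mzML file is required")
    # FileMerger takes a list of input files after -in and one file after -out.
    return ["FileMerger", "-in", *inputs, "-out", output_file]


def merge_mzml(input_files, output_file):
    """Run FileMerger; raises if the tool is missing or exits with an error."""
    if shutil.which("FileMerger") is None:
        raise FileNotFoundError("FileMerger not found on PATH (is OpenMS installed?)")
    subprocess.run(build_filemerger_command(input_files, output_file), check=True)
```

For example, merge_mzml(["fraction_01.mzML", "fraction_02.mzML"], "merged.mzML") would merge two files; the file names here are placeholders.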
