You have measured a lot of mass spectra and obtained matching peptides and inferred proteins via search engines, spectral libraries, or both. What tools do you use to cross-check the data? Delta m/z differences, precursor charge assignments, PTMs...
I think that's a very good question, and I would argue that the small number of answers reflects the lack of well-established QC methodology in proteomics (although there may simply be few people working in that field on Biostar), especially compared to genetics and transcriptomics. A precise answer will also depend on the sample (simple or complex mixture) and its processing (enrichment, for instance).
Probably the very first thing to do is to look at the raw data as it comes off the mass spec, i.e. the elution profile and the raw spectra. I am always amazed how our in-house mass spec specialist can comment on the raw data and quickly assess how good the results are, or at least whether the data is good enough for the question considered. To do this, you really need to know what you are running in the first place and be aware of the capabilities of your machine. Mass spectrometry is still a hands-on experiment in comparison with more mature (and technically easier) technologies like microarrays. Of course, all this requires being where the data is generated, which might not be the case if you work as a bioinformatician and take care of data repositories, for example.
IMHO, there is a need for more QC steps because (1) not everybody has an expert to ask and (2) having automated pipelines that statistically assess QC for single or multiple data sets is crucial. I think that delta m/z differences, precursor charge assignments, PTMs, m/z distributions, etc., as you mention, are a good start. Still, it would be important to formalise the knowledge of the mass spec gurus and implement it in programs. And btw, PRIDE Inspector is a good means to quickly assess public data for meta-analysis.
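As a rough illustration of what automating two of these checks could look like, here is a minimal Python sketch that computes the precursor mass error in ppm and the charge state distribution for a list of peptide-spectrum matches. The input layout (dicts with `observed_mz`, `peptide_mass` and `charge`) and the function names are assumptions made for this example, not the format of any particular search engine or of PRIDE Inspector.

```python
from collections import Counter
from statistics import median

PROTON_MASS = 1.007276466  # mass of a proton in Da


def ppm_error(observed_mz, peptide_mass, charge):
    """Precursor mass error in ppm for one peptide-spectrum match.

    peptide_mass is the neutral monoisotopic mass of the assigned peptide.
    """
    theoretical_mz = (peptide_mass + charge * PROTON_MASS) / charge
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def summarize_psms(psms):
    """Summarize mass errors and charge states for a list of PSM dicts.

    Each PSM is assumed (hypothetical layout) to have the keys
    'observed_mz', 'peptide_mass' and 'charge'.
    """
    errors = [
        ppm_error(p["observed_mz"], p["peptide_mass"], p["charge"])
        for p in psms
    ]
    return {
        "median_ppm": median(errors),
        "max_abs_ppm": max(abs(e) for e in errors),
        "charge_counts": dict(Counter(p["charge"] for p in psms)),
    }
```

A median error far from zero would hint at a calibration offset, and an unusual charge distribution at a problem with charge assignment or the sample itself; where to draw the line depends on the instrument.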
Finally, you might be aware of a recent special issue of Proteomics about QC. I have not had time to read it thoroughly, so I can't really point to any specific method.
Hope this helps.
The Trans-Proteomic Pipeline (TPP) is a free, open-source collection of tools and supporting data formats for shotgun proteomics data analysis. If you go through the TPP, you will find several validation tools in the pipeline.
In the figure you can see the PeptideProphet and ProteinProphet nodes, which are validation tools for mass spectra that have been searched against a database, for example with SEQUEST, within the TPP.
Hope it helps
Take a look at "Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses" (http://www.mcponline.org/cgi/pmidlookup?view=long&pmid=19837981). They also make available a software pipeline that implements their recommended tests.
This is what we use. It is available from http://peptide.nist.gov/metrics/. It has just about every metric you can think of and then some. This is useful for assessing mass errors, charge state distributions, peak widths, etc. Very useful for determining when your LC or MS performance is changing.
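To give an idea of how such metrics can be used to spot a change in performance, here is a minimal Python sketch that flags runs whose value for one metric sits far from the others, using a median/MAD robust z-score. This is only an illustration and not what the NIST pipeline does; the function name, the run labels and the 3.5 threshold are assumptions for the example.

```python
from statistics import median


def flag_drifting_runs(metric_by_run, threshold=3.5):
    """Return the runs whose metric is an outlier by robust z-score.

    metric_by_run maps a run label to one QC metric value, e.g. the
    median precursor mass error or a median peak width (hypothetical).
    """
    values = list(metric_by_run.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that is not
    # dominated by the outliers we are trying to find.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread at all, nothing can be called an outlier
    flagged = []
    for run, value in metric_by_run.items():
        robust_z = 0.6745 * (value - center) / mad
        if abs(robust_z) > threshold:
            flagged.append(run)
    return flagged
```

For instance, `flag_drifting_runs({"run1": 1.0, "run2": 1.2, "run3": 0.9, "run4": 1.1, "run5": 6.0})` flags only `run5`.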
To "cross-check" identification assignments, theGPM's validate function is unparalleled. Search your data using their servers or your own GPM installation; then, in the results, click the protein, then the peptide, then the validate link. This will bring up a list of the top ten assignments to that peptide in the GPMdb.
The following papers may be relevant to your question:
tempted to mark this question as the accepted answer :)
tempted to tag this as a great comment ;-)
I'm sorry, not the 'question' but your answer. What I really had in mind, besides the tools, was a question about the problems of QC in proteomics in general, but you obviously got that message too.