How do people process spectral count data from protein mass spectrometry / proteomics experiments? A particular problem is the number of zeros in the data, most of which, due to the way the data is produced, can't reliably be treated as true zeros. A standard t-test / ANOVA doesn't seem appropriate, as the data isn't normally distributed. The data follows a Chi-squared distribution.
I have spectral count data for each protein in samples from a control vs. test experiment, with 3 biological replicates for each condition. The data has been put through Mascot and Scaffold. Ideally, I am looking for a p-value for each protein, not just across the samples. I want some measure of reliability for the proteins being selected for review (something over and above what the TPP / Scaffold are supplying).
I've been pointed to the G-test and/or the Chi-squared test, and Fisher's exact test has also been suggested. I've tried these in R without much success. I've also tried calculating a G-test value in an Excel spreadsheet, and I've used PepC. One particular issue I am having is generating the statistics on a per-protein basis as opposed to a per-sample basis.
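To make the per-protein question concrete, here is a rough sketch of how a G-test could be run once per protein. It is only an illustration under assumptions that are not part of my actual analysis: the protein names and counts are made up, the replicates are simply pooled (summed) within each condition, and each protein is compared against all other spectra in a 2x2 table. The function names `g_test_2x2` and `per_protein_g_tests` are hypothetical helpers, not from any package.

```python
import math


def g_test_2x2(a, b, c, d):
    """G-test of independence on the 2x2 table [[a, b], [c, d]].

    Returns (G, p), where p comes from a chi-squared distribution
    with 1 degree of freedom.
    """
    n = a + b + c + d
    if n == 0:
        return 0.0, 1.0  # no data at all: nothing to test
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes 0 (0 * ln 0 is taken as 0)
                e = row_totals[i] * col_totals[j] / n
                g += o * math.log(o / e)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-squared with 1 df: P(X > g) = erfc(sqrt(g / 2))
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


def per_protein_g_tests(counts):
    """counts maps protein -> (control replicate counts, test replicate counts).

    ASSUMPTION: replicates are pooled by summing, so variability between
    biological replicates is ignored.
    """
    total_control = sum(sum(ctrl) for ctrl, _ in counts.values())
    total_test = sum(sum(test) for _, test in counts.values())
    results = {}
    for protein, (ctrl, test) in counts.items():
        c, t = sum(ctrl), sum(test)
        # Row 1: spectra for this protein; row 2: spectra for all other proteins
        results[protein] = g_test_2x2(c, t, total_control - c, total_test - t)
    return results


# Hypothetical spectral counts: 3 control and 3 test replicates per protein
example_counts = {
    "protA": ([10, 12, 9], [30, 28, 35]),
    "protB": ([20, 22, 19], [21, 18, 20]),
    "protC": ([0, 0, 1], [5, 4, 6]),
}

results = per_protein_g_tests(example_counts)
for protein, (g, p) in results.items():
    print(f"{protein}: G = {g:.3f}, p = {p:.4g}")
```

Because the replicates are pooled, a calculation like this says nothing about replicate-to-replicate variability, so the resulting p-values are probably optimistic. That is part of why I am unsure whether this is the right approach.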
I've also looked at some of the microarray processing techniques, but it appears to me that the large number of zeros in proteomic spectral-count data rules these approaches out.
Any experience and/or advice would be greatly appreciated. Thanks.