I have two mutational spectra that I need to compare to see whether they are significantly different from one another. Each row of the data below gives a position and the number of mutations found at that position in each sample. I am aware that SciPy has hypergeometric functions that should suit my needs, but with my limited stats knowledge I am having a hard time working out how to feed these raw counts into them to obtain a p-value.
Ultimately, I am trying to write a script using Biopython/SciPy/NumPy that does what is outlined in the journal article "Statistical Test for the Comparison of Samples from Mutational Spectra" by W. Thomas Adams and Thomas R. Skopek (1986).
Here is a sample data set that resembles the data I am working with.
Pos.  Sample 1  Sample 2
2     0         3
3     0         0
6     0         1
8     0         0
12    2         5
15    1         1
26    1         0
34    2         0
47    0         2
77    4         4
Let me know if more details need to be provided!
Update: I am developing a script that implements the Adams & Skopek algorithm to answer my own question. I will post it as an answer once it is complete and working.
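For anyone following along, here is a minimal sketch of the approach as I currently understand it, not the finished script. It treats the data as an R x 2 table with fixed row and column totals, computes the hypergeometric probability of the observed table, and estimates the p-value as the fraction of randomly reshuffled tables that are no more probable than the observed one. The function names (`log_table_prob`, `monte_carlo_pvalue`), the iteration count, and the seed are my own choices rather than anything from the paper, and I have used only the standard library here to keep it self-contained; the same idea could be vectorised with NumPy.

```python
import math
import random

# Mutation counts per position, copied from the table in the question
# (positions 2, 3, 6, 8, 12, 15, 26, 34, 47, 77).
sample1 = [0, 0, 0, 0, 2, 1, 1, 2, 0, 4]
sample2 = [3, 0, 1, 0, 5, 1, 0, 0, 2, 4]


def log_table_prob(col1, col2):
    """Log hypergeometric probability of an R x 2 table with fixed margins.

    P = (prod of row totals! * prod of column totals!) / (N! * prod of cells!)
    """
    n1, n2 = sum(col1), sum(col2)
    log_p = math.lgamma(n1 + 1) + math.lgamma(n2 + 1) - math.lgamma(n1 + n2 + 1)
    for a, b in zip(col1, col2):
        log_p += math.lgamma(a + b + 1) - math.lgamma(a + 1) - math.lgamma(b + 1)
    return log_p


def monte_carlo_pvalue(col1, col2, n_iter=10000, seed=0):
    """Estimate the p-value by reshuffling mutations between the two samples.

    n_iter and seed are illustrative defaults (assumptions, not from the paper).
    """
    rng = random.Random(seed)
    n1 = sum(col1)
    row_totals = [a + b for a, b in zip(col1, col2)]
    # One entry per observed mutation, labelled by the index of its position.
    pool = [i for i, total in enumerate(row_totals) for _ in range(total)]
    observed = log_table_prob(col1, col2)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pool)
        # The first n1 shuffled mutations form the simulated sample 1.
        sim1 = [0] * len(row_totals)
        for i in pool[:n1]:
            sim1[i] += 1
        sim2 = [total - s for total, s in zip(row_totals, sim1)]
        # Count tables that are as improbable as, or more improbable than,
        # the observed one (small tolerance for floating-point ties).
        if log_table_prob(sim1, sim2) <= observed + 1e-9:
            hits += 1
    return hits / n_iter


print("Estimated p-value:", monte_carlo_pvalue(sample1, sample2))
```

Shuffling the pooled mutations keeps both the per-position totals and the per-sample totals fixed, which is what the hypergeometric null requires. The estimate gets more stable as `n_iter` grows, and positions with no mutations in either sample do not affect the result.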