From the documentation for MutSigCV: http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/MutSigCV
I know you can use the datasets that come with the software, but the results will not be as good as when you supply details from your own data. So I am trying to provide the information from my own data.
But this is very confusing. How are these defined?
1. CpG transitions
2. CpG transversions
3. C:G transitions
4. C:G transversions
5. A:T transitions
6. A:T transversions
7. null+indel mutations
The seventh category (null+indel mutations) is clear. (1) How is CpG defined? Does a site count as CpG whenever ref_allele is C/G and the adjacent nucleotide is G/C, or does it have to lie in a CpG island? (2) What are C:G and A:T?
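For reference, here is a minimal Python sketch of one possible reading of these categories. It assumes, and the documentation quoted here does not confirm this, that "CpG" means a plain CpG dinucleotide context (a mutated C followed by G, or a mutated G preceded by C) rather than a CpG island, and that C:G and A:T name the base pair at the mutated position. The transition/transversion split is the standard one: purine to purine or pyrimidine to pyrimidine is a transition, anything else is a transversion. The function names are hypothetical and not part of MutSigCV.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def classify_snv(left, ref, alt, right):
    """Assign a single-base substitution to one of categories 1-6.

    left/right are the flanking bases on the same strand as ref.
    ASSUMPTION: CpG is a dinucleotide context, not a CpG island.
    """
    if not {left, ref, alt, right} <= BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt must differ")

    kind = "transitions" if is_transition(ref, alt) else "transversions"
    if ref in PURINES & {"A"} or ref == "T":
        # Mutation at an A:T base pair.
        return f"A:T {kind}"

    # Mutation at a C:G base pair; check for the CpG dinucleotide context.
    in_cpg = (ref == "C" and right == "G") or (ref == "G" and left == "C")
    return f"{'CpG' if in_cpg else 'C:G'} {kind}"


if __name__ == "__main__":
    print(classify_snv("A", "C", "T", "G"))  # C followed by G, C->T
    print(classify_snv("A", "C", "A", "A"))  # C not followed by G, C->A
    print(classify_snv("A", "A", "G", "T"))  # A->G
```

If the island interpretation turned out to be the right one, only the `in_cpg` test would change; it would need genomic coordinates and an island annotation instead of the flanking bases.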
If I look at the dataset that comes with the software:
exome_full192.coverage.txt
gene effect categ coverage
A1BG noncoding A(A->C)A 12
A1BG noncoding A(A->C)C 14
A1BG noncoding A(A->C)G 15
A1BG noncoding A(A->C)T 9
A1BG noncoding A(A->G)A 12
A1BG noncoding A(A->G)C 14
A1BG noncoding A(A->G)G 15
A1BG noncoding A(A->G)T 9
The categ column here is not consistent with how the categories are defined in the other input datasets.
What is coverage here? Is it the tumor alternate allele count? The documentation is very confusing.
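For anyone else puzzling over this file: the categ strings in the excerpt look like a trinucleotide context, with the left flanking base, then ref->alt in parentheses, then the right flanking base. Counting 4 left bases x 4 reference bases x 3 alternate bases x 4 right bases gives 192 combinations, which would match the "192" in the file name. This is my reading of the excerpt, not something stated in the documentation quoted above, and it says nothing about what the coverage number means. The Python sketch below parses rows under that assumption; it also assumes whitespace-separated columns as shown here, and the helper names are hypothetical.

```python
import re
from itertools import product

# ASSUMPTION: categ has the form L(R->A)X, e.g. "A(A->C)G".
CATEG_RE = re.compile(r"^([ACGT])\(([ACGT])->([ACGT])\)([ACGT])$")


def parse_categ(categ):
    """Split 'A(A->C)G' into (left, ref, alt, right)."""
    match = CATEG_RE.match(categ)
    if match is None:
        raise ValueError(f"unrecognised categ string: {categ!r}")
    left, ref, alt, right = match.groups()
    if ref == alt:
        raise ValueError(f"ref and alt are identical in {categ!r}")
    return left, ref, alt, right


def all_contexts():
    """Enumerate every left/ref/alt/right combination with ref != alt."""
    return [
        f"{left}({ref}->{alt}){right}"
        for left, ref, alt, right in product("ACGT", repeat=4)
        if ref != alt
    ]


def parse_rows(text):
    """Parse data rows (no header) into (gene, effect, context, coverage)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, effect, categ, coverage = line.split()
        rows.append((gene, effect, parse_categ(categ), int(coverage)))
    return rows


# Two rows copied from the excerpt above.
EXCERPT = """A1BG noncoding A(A->C)A 12
A1BG noncoding A(A->C)C 14"""

if __name__ == "__main__":
    print(len(all_contexts()))
    for row in parse_rows(EXCERPT):
        print(row)
```

Once a row is split into left/ref/alt/right, it can be collapsed into whatever coarser category scheme the other input files use, provided the definition of those categories is known.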
There is a tool, CovGen, which can be used to generate the sample- or experiment-specific coverage files required by MutSigCV. I have not used it yet, but I will try it soon. In the meantime, if you have figured out how to run MutSigCV, please post any suggestions that might be useful for others.