Hi,
I am analysing TF proximity mutations in the ICGC cancer cohort, and I would like to show that the mutation signature profile changes upon TF binding.
When I analysed the TF proximity mutations with mutation signature analysis packages (such as SomaticSignatures), I got the "famous" signature graph of 96 trinucleotide proportions. Within my TF, I see that T>A mutations are enriched, but I also see that the sequence context of the binding region is itself dominated by A and T bases. I think I need to show that these two things (the enrichment and the sequence context) are independent.
I have seen that mutation signature analysis is mostly done tumour-wide rather than on specific regions. Could you enlighten me: is any kind of normalisation done when this analysis is applied to specific regions?
My questions are: 1) Am I overthinking this?
2) What statistical test should I use to show that total cancer mutations and TF proximity mutations are different (something like Kullback-Leibler divergence)? I couldn't incorporate the sequence context into it, though.
Thank you very much for your help,
Best,
Tunc.
Hello, we used cosine similarity for the comparison in my second question. Please see the Methods section of https://www.nature.com/articles/s41467-020-14644-y.
Also, more packages have come out since then, so you should check deconstructSigs too.
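In case it helps, here is a minimal Python sketch of that kind of comparison. It is not the code from the paper: the function names and the toy four-channel counts are made up for illustration (a real profile has 96 channels), and dividing each channel by how often its trinucleotide context occurs in the region is just one assumed way of accounting for sequence composition before comparing profiles.

```python
import math

def normalise_by_context(mutation_counts, context_counts):
    """Convert per-channel mutation counts into per-context rates,
    then rescale the rates so they sum to 1 (a proportion profile).
    Assumption: context_counts[i] is the number of times the
    trinucleotide of channel i occurs in the region analysed."""
    rates = [m / c if c > 0 else 0.0
             for m, c in zip(mutation_counts, context_counts)]
    total = sum(rates)
    return [r / total for r in rates] if total > 0 else rates

def cosine_similarity(p, q):
    """Cosine similarity between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0

# Toy data (hypothetical): channel 0 looks enriched near the TF,
# but its sequence context is also far more common there.
tf_counts = [30, 5, 5, 10]
tf_context = [300, 50, 50, 100]
genome_counts = [100, 100, 100, 100]
genome_context = [1000, 1000, 1000, 1000]

raw_similarity = cosine_similarity(tf_counts, genome_counts)
tf_profile = normalise_by_context(tf_counts, tf_context)
genome_profile = normalise_by_context(genome_counts, genome_context)
corrected_similarity = cosine_similarity(tf_profile, genome_profile)

print(f"raw: {raw_similarity:.3f}, context-corrected: {corrected_similarity:.3f}")
```

In this toy case the raw profiles look different, but the difference disappears once the counts are divided by context frequency, which is the situation the question is worried about. If the profiles still differ after correction, the enrichment is not explained by base composition alone.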
Good luck.