Hi sagardesai91
Just create a new VCF file with the header below:
##fileformat=VCFv4.2
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 10002 MU43280717 A T . . MELA-AU
chr1 10026 MU75019506 A G . . PBCA-US
chr1 10074 MU121369972 A G . . PBCA-US
chr1 10080 MU121498435 A G . . PBCA-US
chr1 10085 MU121369537 T G . . PBCA-US
chr1 10086 MU121375628 A G . . PBCA-US
chr1 10087 MU121380000 A G . . PBCA-US
chr1 10091 MU121508239 T G . . PBCA-US
chr1 10098 MU121433300 A G . . PBCA-US
chr1 10108 MU15348322 C T . . LUSC-KR
Put "." (dot) in the ID, QUAL and FILTER fields (in the example above I have specific IDs), and in the INFO field put the name of the sample (A1 A2 A3 A4 A5 B1 B2 B3 B4 B5 C1 C2 C3 C4 C5) that the mutation belongs to. Separate the columns with tabs.
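If your mutations are already in a table, a file in this layout could be written with base R. This is only a minimal sketch: the data frame `muts`, its column names, and the positions and sample labels in it are hypothetical placeholders, not real data.

# Hypothetical per-sample mutation table (placeholder values)
muts <- data.frame(
  chrom  = c("chr1", "chr1", "chr1"),
  pos    = c(10002L, 10026L, 10108L),
  ref    = c("A", "A", "C"),
  alt    = c("T", "G", "T"),
  sample = c("A1", "A1", "B3"),
  stringsAsFactors = FALSE
)

# Arrange the eight fixed VCF columns: ID, QUAL and FILTER are ".",
# and INFO carries the sample name
vcf_body <- data.frame(
  CHROM = muts$chrom, POS = muts$pos, ID = ".",
  REF = muts$ref, ALT = muts$alt,
  QUAL = ".", FILTER = ".", INFO = muts$sample,
  stringsAsFactors = FALSE
)

# Write the two header lines first, then append the tab-separated records
out_file <- "sample.vcf"
writeLines(c("##fileformat=VCFv4.2",
             "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"), out_file)
write.table(vcf_body, out_file, append = TRUE, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

The file name "sample.vcf" is chosen only so that it matches the readVcf() call used further down.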
Then you can use SomaticSignatures in R to plot the mutation spectra of all samples at once:
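library(VariantAnnotation)  # provides readVcf()
library(SomaticSignatures)
library(BSgenome.Hsapiens.UCSC.hg19)
library(ggplot2)
# Read the VCF and convert it to a VRanges object
vcf <- readVcf("sample.vcf", "hg19")
vr <- as(vcf, "VRanges")
# Extract the trinucleotide context of each SNV from the reference genome
sca_motifs <- mutationContext(vr, BSgenome.Hsapiens.UCSC.hg19, unify = TRUE)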
To plot the spectra with the colors commonly used in COSMIC and in publications, use:
plotMutationSpectrum(sca_motifs, "INFO", colorby = "alteration", normalize = TRUE) +
  labs(x = "\nsignature", y = "contribution\n") +
  theme(axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 10),
        axis.title.x = element_text(size = 12, face = "bold"),
        axis.title.y = element_text(size = 12, face = "bold")) +
  scale_fill_manual(values = c("lightblue", "black", "red", "grey", "darkolivegreen3", "lightsalmon"))
If you want to get the values as a matrix, to plot them yourself or use them somewhere else, do:
sca_mm = motifMatrix(sca_motifs, group="INFO", normalize = TRUE)
If you want to change the order of your samples, you can do this right after the mutationContext() call:
sca_motifs$INFO<-factor(sca_motifs$INFO, levels = c("A1", "B4", "B1", "C5",.......,etc))
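To see what the factor() call does, here is a small base R sketch that needs no Bioconductor packages. The vector `info` is a hypothetical stand-in for the INFO column, and the four labels are just the ones named above, not a complete sample list.

# Hypothetical sample labels as they might appear in the INFO column
info <- c("A1", "B1", "A1", "C5", "B4", "B1")

# Without explicit levels, factor() sorts the labels alphabetically
levels(factor(info))

# Passing `levels` fixes the order, which ggplot2-based plots normally follow
info_ordered <- factor(info, levels = c("A1", "B4", "B1", "C5"))
levels(info_ordered)

# A label missing from `levels` silently becomes NA, so check for that
stopifnot(!anyNA(info_ordered))

The last check is worth keeping when you apply this to your real data: any sample name left out of `levels` is turned into NA rather than raising an error.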
Finally, if you want to plot raw counts (not normalized to %), set normalize = FALSE.
This is amazing, I'll try it out and let you know if it works, thank you so much!
Hey, the confusion remains: what do I do if one particular mutation is present in more than one sample?
Report it in the VCF file once per sample, each time with a different INFO value (A1, A4, B3).
When you plot them, SomaticSignatures will treat these variants separately, because the plotMutationSpectrum function groups the variants according to the INFO column.
e.g.
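As a minimal sketch, assuming the intent is one VCF record per sample: a mutation shared by A1, A4 and B3 could be expanded into three rows like this. The table `shared`, its column names, and the positions and alleles in it are hypothetical placeholders.

# Hypothetical input: one row per mutation, with a comma-separated list
# of the samples that carry it (placeholder values)
shared <- data.frame(
  chrom   = c("chr1", "chr1"),
  pos     = c(10002L, 10026L),
  ref     = c("A", "A"),
  alt     = c("T", "G"),
  samples = c("A1,A4,B3", "A1"),
  stringsAsFactors = FALSE
)

# Split the sample lists and repeat each mutation once per sample
sample_list <- strsplit(shared$samples, ",", fixed = TRUE)
idx <- rep(seq_len(nrow(shared)), lengths(sample_list))
per_sample <- shared[idx, c("chrom", "pos", "ref", "alt")]

# Each repeated row gets exactly one sample name for the INFO field
per_sample$INFO <- unlist(sample_list)
rownames(per_sample) <- NULL
per_sample

The first mutation ends up on three rows (INFO A1, A4 and B3) and the second on one row (INFO A1), so each sample keeps its own copy of the variant.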
So no worries!!!!
Okay, so if I've understood this correctly, my input VCF file, with mutations present in multiple samples, will look like this (just as an example):
Am I right?
oh no no, I got it! Sorry for the trouble, thanks a ton!
I don't know, because your post is a disaster...hahahaha. Can you please edit it using the code button?
If I understand correctly, it is wrong.
The correct version should be:
Hahaha, I am extremely sorry about that post, but yes I got it :D